
im&<v yr^h^^m^-thX o izffi&ztix 

mmmmtf. mtvy-vry^wzt^v^frh 
B^titzm-^^xhi^im-t. ^n<m$xn . 10 
fornix. ^<tt>^ioro^7A§fx^ 
^c*f-f 4§s i <or-^7a-»iffli*3 <w&< t i>m 

2(Q7u?7M.Ztlt:fii^l l zm-&&2cr>T-?yu- 

coruy^^tit^^v^m^-f^ztifZX'yx 
B&zfihmit**viSii^x'$>z>. msmitfficom 

x.y ; jy. 

imm) mzmmmv. mmn?$i$f>wmiit 20 
^ * 0 r F i-* 7 4 FA>£><Das 1 a £ft 
U7 f vx^zm- &m lco^^vT^uxa 
kV^nTvyj&iiiitzt^-fc^zK-t&micn 
*^)TY\sx*m^h£o{zmft-tz>. mm2n 

mm* 1 mm^^nmmsT wxyj- 

mhmiT y vxy 4 -)v v t m t h- -y naa-c'*> 
4, ii*«3ieij^axyi;y. 

[ff*JS5] !5fe«^^Ma«£7FU-X7 4- 30 

imimnmK msmr f uxm^-ti x 0 ten 

M*«4iE»0Wlx>'>'>. 

xi^^ixt^^zmhmmm^y < ->v f* 
*tt, ft^JSiKS^spixy^y. 

[zmhmmK$ft*mth. mm6imnm 40 
inms i fjie«-^^*\ zn&m^y < 

t«t*»w xaif^Rpf . mmsm 

ff)9SMxy s yy. 50 



JP 2000-215061

[bwsr i o ] mm-£$i-$tf. mm* <nmyy 
imm 1 1 1 mm&tfttf* mm 

(DAGEN) y <(-)l'b'frt>Bf&.Ztltz&-&DAGE 

N7-( —jv f ££-f 4 s m*JB 1 wa/wmx.yyy. 
inm 1 2 1 mmsv age n7^-;p f# . ts 

ijco^axy^y. 

enroll 3 ] Binea^iafli^. btjsod agen? 
yizmix . ffiteis^D age n 7 * f j» -7 
4. n$«i 2ie«o«sxyx>. 

[»^14] mi%li5£Vm2<7)* : £VTVUX 

\,zi.r>xm^tihTY\sxfrt 3 &it5i.V&2<r)*'\ 
yyh' itiMlzy x «y f-ti J; 3 CftfW* 7x7f3 

[W*3I15] flufem \HlV : &2<r>y°uy'yj±%1x 

tzn&^izm mxv^cDT-fy q-smww 

**t», IWS9I1 4te«^axyx*y. 
mxmi 6 ) U77-feX^$:BJ5|(^J«fi : 
T* 4 t»Ofc i t C J:-5T^^ y 77-fex^ 

4 . 1 IfifJcoKiax vify. 

mm. 17] r^-fe^AM. «e^7 

X&MZtil , if *« 1 Btt^Wiaxy . 

cit*JSi8] df-^-b'7rr^^LTfrierD 

f -f 7.7V-Y 7^7^ Sr^- LTMtern-b -y ^casttS 
^T^X7U-^i;. 

frtero-fe >y-9-(^^$^**iig« ( r f ) ass 
k. 

^RFE8ftfc^§^7>7 i 7k» 

sr $ 'oic^-o, mm i iawf^^fA. 

Gi&aLxm&x* oMtm&t&t* mm- 

4 . mm i tefjof y^^wf a. 

[11*312 o ] i'yvtetm.WXmx'b'* 

X. 

j»jeoa-&*^^/**t»^^ r 7-f-A'F**tf. 

•y7t. 

fflB^^N' -y 7 r rtO#^«0fflB^ ^7 -f -)V F rtOHfi 
^fc < fc * 1 or d /9 A 3 ft fca^fcH* ft » 1 o 
[if#JH2 2] ^<i^Hi)IB^^^-&7HIx

tttiztttim i * u 7 k uxfc xr/mtm 2 cor 
m. 

y**fc*W*lWB7 K1/-.X7 -f -)V Y t PI tt'<y bffi 
»&»&0)t9£a£tf*tf>19iEa£7 VUX7< -)i> H Sr 

[ « 2 4 ] mea&ft 1 o^^tcjfr 5 # 

[ H^JS 2 5 ] ffiBB&fatt>mt& 1 t *«- 

SffrtoM Xftffflrt Zm?& *T •> 7* £ fefc-t 

tf, if«ja24ie«co^ffi„ 

35 2 07D ^7 A * 'J ffo^r- 9TYVX% 

£(DAGEN) 7 4-Jl>\!i)>t > B!SlZtll&&DAG 

[ffl*JH2 7] fflfEtS^DAGEN7^-^H*«, IS 
£7HUX7*-/I^V)H»«:#lfc*-4s BH&H2 6E 30 

CI«*3I28] fl^a*****. WJ&ODAGEN* 
^CISS LX . HtffBte^D A GEN 7 — K SrflW 

ff*«2 6Ef^:/7£. 
[ff*«29] mifeJ:t/m20>* U7HUX*>fe 

?yF«:J^fc7xy*r*.Xf--y7*&;&fefc:*tt, If 

#«2 2iBao*a. 

[»3l<3S3 0] WE«^*^Sil*JJ:l«B2<07 , o 
?7J>2titz$i$l / zm'&mit5j:Xfm2cDT-'77n 40 

2 9ie«<o^rffi. 

yj&ntitz* * u t frrai^MK'!)7-b y 7* u <o 
*S2 ne«co*&. 

[ff*JS 3 2 ] 13£*S£*r » 7>\ 
tXT-y7b, 

tWi^tLti7nVy^iit^^:V^im^LXM 50 
^*U#^Sr»«r6^f-j'7 , i:, *3£(c*0,
[000 1] 

[3W)i*>ll*&tt*aif] *»fltt. mxyiSyt. 
[0002] 

?n7D-fe -y *r?tift*§ffl£&rtl £ t tffctefiT v > 
6. £Oid&M?iJllff£ff3£^£<<0ft&l>7- 
*t?^x 7**20 £>ft-0^. MWafrtcJ: 9£*«ja 

A'. y 7 r t4i 6*ifc<ofcMWfc:a*3frC , Hffxx 

>y MZf^ ^N*-yf-$il-?.. 7^DrD-fe7-y"li. f 

ir/yh^i 7 J-Uff-T ■& tabled ^4?7;i/-7° -y 
tm^.0)V7h^^TT7Vr-^ayizmtXim 

-h-ri»^fc. m\mf0)^tff)X7 : J 

[0 00 3] £<^M&S:yMTcDMgxyi;y#2n<i> 
iiTiJ "9 , v-f 7o7o-fe v-WUtftt-Mfc-tttt 

OtyfiDSP) )W£<ttMvoV&. DSPii. A 

$>£oizffifRzti. ztizmssitzKMz. xomit 

[0004] 

v>*<, TiS9)Wi^7a*yy<n£d%mix.yiSy0) 
[0005] 

HJX , #-coMr*^1S-^^^yN'-y 7 r t S «t o (CM 
fW*^"«'y 7 r k , ft^N'-y 7 rj&»fe<0««*a^ 

nfyiz&gtZko izffif&ZtLXti*) . ThlZ<r>97 

mzimix. ^<ktssio7D^A$^^ 

❖fcStt S ^ 1 c7)r- ^ 7 OHHW3 J: MB 2 <07 d / 
9AS*ifc«^fciW4»20f f -^70-M«|^a^ 

[0006] U^'ot, 4W9K4tltfllli, S'J«07 
o^5A$ii^^5:jig^-ri.i t lz£ iX&fcZilZ, 

(iztm. 7^y7)UttiiiayjU)V) a^^tc 
lB?S^-6a*lll«*a*t*. iOidfcUT, J5Sx 

^mzM.mzmmix^m^y^yco-kwx^-rv f 

-it. *-#*i:LT**^'»7Trtk:flH*8*tTV^ 

[ 0 0 0 7 ] a w\7 F t'Sd^^flF^-fb^'*^^!! 
^a-^^co^^^'-y 7 7 from 1 <7)-*t<7)^^-T' 

[0008] *»E*S«tffCtt, *£*rfM±* S'J^<=0 

LTJgaSti*. ttJ&SiaiWCtts «£*^tt7VF 

* r * ^7;p l-c t -fhzttfX'Zh. 
[ooo9] m Kit. m^mma. a-&&^rtoa 

^ =E V 7 F J^CO^S 1 nruyyJxZixtzX * 'J 

r f i/x#^tc*w*JB 1 ^ * u r f pxfc itf * 2 
orn ^5 &%fitzx * y ^mcjw 4 » 2 co^ t y r 

£7 F 1^7 ■< -/H**0\- ^fitzT x7 

IV* * 'J dMMcifrf & 7 Yvxy -c — ;U F t |S| 1 1' •/ F 
fiWC-JMitf, - ix«i^x;U-7 7 F ttff * L i ^» 
taartii:*^**. £0*6. 7Yv-x<r>m.mt. 
ifr^&f1M^#T*7/Hir4W8S 1 13 J:t^3S2<?D#^ 

[0010] «£<r*fc*tLT«fcS$rt> FRfcifcW 
y7FU*ttnBB7FUXi: l/CEflS*u -Hit J:-? 

x. ®^mmit*v>zo%tit$£itt?2>m&T\ ( i'A$: 

a-fTS i a K1MW4 rtttt? J: ^.fi 7;HMMi* 
-ft^J: 0 i4«5r^3i-7V 3 v*+TX- F-f Sore, 7 
FU*fc#f-fS;KXF£SS7 4-/1/ FWMXfclB'h 
L, -eixtio-C, 7FK*S#ta£®$:t'-y F»Sr« 

&Zt&X'Z&. ■ 

[0011] ^tyr^-tx^ti, ^VN*>/7rrtc7) 




77 Fr^/UHHi, 2owty^i: 
-7/1-7 * F*>s&BttU:lll»$*u ffi«<o* * y ft 

odf + 7 a S 7. S <K« L 77 'J v- 3 

[001-2] *^dO*2<0**fcWf 4?3-rii* 
fc. »ltf)tWM=Stt*-fa-^<W7*-fe 7 Fi: f 4i 
10 t^'T#. &Wj:imm07^ZT-hi3£V : !m 

[0 0 13] ay/*? h«^7*- V-yFSrSft-fS 
tHC. 7F^7^-;WFSrA-F*a^ifir^C*W- 

f3-^lt ®^^^7^-;PF<7)^tiJCJC 

20 a^i-4ik*«*cs*. 

[0 0 14] h* 7 FjRfcS ^fcJjLH-Sfcftfc. a-^ft 

❖a, FJr«^a^ft^^^< fc t>gi 1 wft^tcitti. 

«W^«fm*7 < F i*f^^-7 -f FA* 
m^7o^7ASftfcft^§ft^7 ^ — yi^H<fc 0 
i ^h' 7 f z$ts x 0 iz-th z t h . ^ * y 

R£«jH-4i «#W«i, Brjeo^^C 

je#L-ca-^ft^^m 1 ^cMtww 

[ 0 0 1 5 ] JtifiU: 5 4 J: 0 . Bj«0« 

r, ffi^ft^t*a^*7^— -7 vhtirf *]*£«> 

[0016] 7D^ 7 A$il^#ft^* { T-^7FWX 
^4(DAGEN) 3- F 7 F 
it. mWTuVv A %tlfctil<$<?MWl<r>D A G E N 3 
40 - F £ a^l8tfi-rtO*£-& D A G E N 3- F 7 4 F (C 

iS^a^fci^im^ffo^fc^fiS. £-&DAGE 
N n- F 7 ^ -IV Htttt-&7 F 7 -f F<0-^Sr 
JBflWiCi:**^**. te^DAGEN3-F7<-;P 
F#4x ^flS*B^fc:tt, «^W«4Hf«<0D AG E N 
^ LX&£T> A G E N 7 >f -)i> F 5r fgW4 C 

[00 17] jsny^'ycii. ^iteJ:^20^ ; E 
>J 7 F U^fci -^T-?-^-fiX^S l JS^S7 F VXfrh<T> 
50 aiitiJ:t^»2tf)* / <9yFSrafeMt:7x«yf-f4J:d 
fci&fFf 5 r-7 7 x -y f-3 y h o- y SriftttS £ fc #
TS*. t-*?4 ba>ho-5t. g51fcil>'SS2 

[0018] **««silt«TJi» r-byy^^v^^ 

LT\ *5!ttteWJBT*6^n» Mc*ff £*ra^^£ J ?• 
;^.&rfc;^> i ••e£?>. >\-F»&ir«tt. 10 

^fc MWCllfft 6 i fc I . 

[0019] Jfitf . ±j*U:«I 

iyy>m7ot7t. fcfcx.tf#-f L*>*3?* 
4*fiWi=firv^**7 f ^7/HI#7°a-fe vihWIttSiti . 
7-o-b .y-tii, it fc ;U2tt£Jft*ftfltQlli (ASIC) 
<9 i 3 fc LTIBHi- 4 - fc £ , 

[0020] JdaUfcJSlxy^ySr^Of^^Ml^ 

KMvxfAWi:. ao^oro^Asnfc^tu*^ 20 

LT * y ffH?£ i 0 CIWW § ft 

tiinyAM 7-^7-fe y 7*5 &fc* fc -fS ,1 fc , •?• 

vti^msx-z &frzmt?>*ofcm&?iz fc*< 

[0021] *%BHwS4>-tft«o»Kfc:J:iUf , r^*^ 
/Wt^JQ®^XTAffl<7)^7'U7 , n-b-y9-*5*|«t§ 30 

* ';ift^*»<9<oa^^ ^^ir-tyyjut & i 3 fc 

[0022] zzx\ 'r^ry'ra-bv-tj tv^ffl 

■«"tfcb%«**35^t4 ;ws itf/t fctt7* 
;^5ffitO«lfltS:#tft<0fcJ£<)B8?-r^$ifcS:a 

[0023] trttrvra* -y-^ti. ^fc t\if-m 40 
h. 

[0024] *mi<?)*<oib<7)imi,z Xiitf , jisiiy 
Tt^ASfifc;** ytSM**^T*y7>£ii£«£ 




l(07O^7A^fL7t^C*tt&miOT-?7O- 
4. 

[0025] 

[#»059l*»B!tt] «i, fcfc*Jfft5£JH&*m 

ass (as i o wzmiiitihT : s?)vm^?v*.v 

■f (DSP ) fc1ffcfEffl$ftl>fr\ tttfWttftOWaxV 
i^yfciJfcfflSflS. 

[ 0 0 2 6 ] 0 Hi, *»W>-HitWt*f 7 
ornt^tl O^yo'y^iaT'S)!.. -r^oru-te 
•y?l Ott, fy^Hlf7Dt7t(DSP) T». 

^i o<r)*mi<n-%wm:WM-thMzmfo<r>bh& 
tixm. mx-mmzm^hhz t tfx-z h . tztt 

tf . 7 l/f »J 9 • 7- h V K 60*Olttft* 5 , 0 7 
2, 4 18-*fctt» DSPfr1$jIfcEiK§;h/efc , 3. * 

&*>#IBI*ilF85, 3 29, 4 7 1-ffcii. DSP^f 

*ffifi<Q-~&kLXZZIiZmMt&. v-f 7 07°n-b-y 

[00273 A.f ; fc * 5 t-# s 

mzti&*mp?m5 , 072, 4 1 8-ttc, «fc* 

la^fff^S, 0 7 2, 4 18-f«OH2~01 8tlS*8§ 

WmiK5. 0 7 2, 4 18TOe^§^^XrAS: 
$^{ctt#fl»^fc*^l». -?-c7)J:o=5rxXTA(i, 

XtA, t-7nyho-;i-, ntf? h?y ho-;!-^ 
7.fA, '«Mi|^afl^XT^, xn-df+y-byy^ 

[00 28] 01OV-f ^oro-fe-yHfO^^^^T 

F9t««Afc:J:i«HFffiIW9 84 024 5 5. 4 
^ (TI -28433 ) fcfeSJSfrCfcO, 
®kLXZZlzWffith. 

[0029] mz. *SKBfcJ:47 , o* vW-mcr)& 

BJ^ooASWSrHliW^^-r^ro-fe -y-t l oco 
^*b&I3T'S)S. rotyti 0{i. jllxyyyi o 
0 fcro-b y^N'-y 9YU~y 2 0 fc MM/cv**. * 
mmmmx-u. m-t-y^ra. ®%m&mnm ( as i

C) (c^S§itJtr^/Mt^o-t-y9-l QX'hh. 

too30] muz^txoiz. #uixyy'yioo 

14. JHI3T1 02i:J(!SJl371 0 2&WI371 02 
'J 4 y ?-7 x -f X-t%b*> ( §m3.- «y M 0 4 i: SrW 

^ s+ifesrasis (cpu) mm*. 

[003 1] 7D'b7tA'7?71/- y20(4. A*y? 
7V-yA7.2 2£-fr^. ^tLtiiMaxy^'y^^ 
>j1=aa--y h i 0 4#g$g£*ro^. A 7 ;rw 10 

y^*X2 2C{J, ^Jf^'yv-A^tU 24, J^SSgB 
2 6fcJ:^l-^y^-7i^X28t^$fLTV^ 

[0032] flfeoiiMWTii. m%mm$kv/&tz 
nm&mzm ixx^wzmmx-z t z t vm> 

hX*hh 0 . ££x.(f. *&JIxyy'yi 0 0(47°O-fe>y 
-tl 0£B&?&ZbtfX'%. Tu-tvWiytTV- 
y2CMZZfrL,ftmtiX^&. fflxy-zyioo 

im^vt-y x4xmft-fh^,7yu-y20 20 

^tt£LT*^±fcfg»£ft£DSPT'&9f#l>. ® 
gxyy'yi 00(4. tzbUi. DSPT'{i5r<v-f ^ 
nrD-tr-y^t-ri.^i:*>'T-#. A S I C$ffiUW-<08 

hZbtfX'^h. 

[ 0 0 3 3 ] H2 (4. M3 7 1 0 2<7)-»£0lJ?)a# 
flfjl£^-f. m& Xolz, JftSnT 102(4. 

4-?<7)jgS. -f&;b*>. ^A' 7 7ri-7h ( I XX 

7M 106fc3o<y0||ffxx.ybt££A,-ev^. II 30 
fiXx.yM4. rn^7A7n-xx-y h (Pi-7 
M 108t> TKl/Xf-?7D-a-7h (Aa- 

»M 1 10t. ^N'.y7TXX-yh ( lJL-«yM 

1 0 6*^m^S^#^$rHffU7-D^7A7D-$r 

$IJfflI*^K«tl)T-^lt»XX.yh (Da-yh) 1 

12fcT*S. 

[00 34] 0314. JH3710 2«Pa-7M0 
8, Ai-7 I- 1 1 0fc4l/*Dx^.y b 1 12 2rP*HtC 
^"tfc &0:S:-J71 0 2tfD£££'£&giis£g^ 

•rS^'XffljgSr^-r. Pa- ••/ M 0 8(4. tzb x(4. 40 

/k-r$ijffliiisst . GoTo/mmmmt. utr- 

b^y?y^7;y-:fc4tfi0&*vx:?, 77^i^(4 
hiuuisxtcoi. o%7u?y£.y a-£®mtm 

mtmtZ&tcWXftZ^^X'^h. PXX-yh 

1 0 8(i. rmr-fyJ h^x (EB, FB) 13 
0. 1 32tT-?V-V^X (CB, DB) 134, 
1 36tTK^^<7. (KAB) 14 2t(C^§ 
fi.T^I>. §^>(i. Pxx-yh 108(4. CSR, AC 
Bt5£VRGDbyOl>ZtltzZ&X&&<XZjtlX 
Axx.yM 1 0fc4t/D:z.x.y h l l 2ftcr>y7JL- 50 




•yhlzm-SZtlX^h. 

[0035] 03(^t 4 o K. *HlS6^K'(i. Axx 
•y M 1 0(4Uv^^7r-f^3 0fcr-^7KWX|6 
7xx >yh (DAGEN) 3 2 i ftlfifc 4lA'Ha>l 
ggB(ALU) 3 4i:£-&X,T^&. Ai^»/M/y 
7.;?7y^3 0(4$2?Sftl'>'7.?£-&*. <Hx£> 
+fc(i. r HUX^^'(fC'^<T-^7n-(ci,^ 

T'#l.l6t' yh^y^lxy*X^ (AR0 

AR7) teXlfr—rUi/X? (DR0 DR 

3) ££(;:. 1x^x^7 r-f;K4. 16t' 7 h 

mW* -y 7 r U i?X 9 b 7 h* v h -f-?^- i/V i>X ? 
b%i$A,X'\">h. MM^^X (EB, FB, CB, DB) 
130, 132, 134, 1 3 6tmX'%<. T-?5t 
iStA'7.1 4 0fe4l^r Kl^X^^'Xl 4 2* { Axx>y 
M^7.?7y-r^3 0(C^§nTV>|.. Axx-yh 

1/ y'7. ? 7 r -^1/3 0(4. -e^-rngytfr^tf^irti. 

l^ttA'7.144, 14 6t4-5TAxx>yhDAG 
ENxx.y>3 2(C^§ilTV^. DAGENxx-y 
b32(4. 1 bV-vYX/YVUXfb. tztlimm 

xyisyioom7vux%!kz®mi&m-iim 

H ZT/X 9 -y 9HM y 9 V VX 9 b i-kAsX'^h . 

[0036] Aa-7 h 1 1 0(4. tm, mn&xvA 
nd. oR&xxsxoRmmMMttt'vALuizmm 
mizmmtm&mx'%< y7^«t#^ALU3 

4i3-/CT'U|>. ALU3 4(i. iSim^X (EB, D 
B) 13 0, 1 36tiJ:lftii^%.mT-?'<X (KD 
B) 14 0(Ct^$^T^-l». Aa-7hALUIi. 
Pax y h l 0 81^^*7.^7 r^)Vi)^V : JX9^i 
Sftti> P D A^n'X(:4 ->X Pxx-y h 1 0 8(C^$ 
tlX^h. ALU34(i. TVl'XteitfT-fl'iSX 
?ft®Z%it?&><XRGA, RGBtUi/XfyrJ 
)V3 0(07 VUXt6±X/T-?l' : JXf<,Z&&t&rtX 
RGDfc(C4r>TAxXy h\s : JX?7T-i )V30{Zh 

misztix^k. 

[003 7] 0*^^1)4 0 (C. Dxx>yt>112 
(i. Da^7M/y'X^7r^f;l'36t, Dxx-yhA 
LU38t. Da-7hx7^4 0i, 2o<03ggfc4 
^fl:i-.yh (MAC1, MAC 2) 4 2, 44 t£ 
^T'V>&„ Da-.y Mxy'7^7r-f;l^3 6tDxx 
vhALU38tDJL--y hv-7^4 0t(i, A'X (E 
B, FB, CB, DB, KDB) 1 30, 1 32, 1 3 
4, 136, 140(Cfe^$iX. MACxx-yh 
42, 4 4(4. s*X (CB, DB, KDB) 134, 1 
36. \A0bT—?V-Y^X (BB) 144t(C^ 
£SivC^lw Dxx-y hU^X97r^)U36it. 4 

0h--yh^»§§ (AC0 AC 3 ) 1 6 b'-y 

hffl$U : JX?tZ-£A,X'^Z>. Dxx-yM 1 
2(i. Axx-yhl 1 0<^1 6t' 7 h^y^*54^r 
-9WJX 9 i V-X b LXmm L?t 0 . 4 0 t'-y 

SfsoffitrXT ^-y a yi- isxfzmm-fh z b 

tfX'th. D:x--ybl^X? 7 7^3 6(4, £gf§ 
5 A b^X (ACWO. ACW1) 14 6, 14 8^ 
LTDJL--y hALU38fcJ:^MAC 1&2 42, 
44*><i>, (ACW1 ) 148 

SrtflTDa^y 1-^7*4 0*^ r-?£gfit 
4. T-?(4, JOSy-FAX (ACRO, AC R 
1) 1 50. 1 52^LTDJL-- y M/^^T-f 
/l#JHW»&DJL-vbALU38. DJL--yh^7^ 

4 0fcJ:^IVIAC 1&2 4 2, 44K»*aj$*l*. 
Daz 7 hALU38£Da--y t^7?4 Ofc(4, E 
FC-, DRB, T>R2i5i.lfAChby^)V^fltz^t 
Zt%^X£-frlXA3---y b 1 08Wr7*Jt--.y MC 

[0038] 04£#Bft4fc, 3 27-K#4M<*7 
t^j.-(IBQ) 502Jttf(fe^A'77ri-7h 
10 6A*^$:frtV>4. IBQ5 0 2J4, 8t'-yhAM 
h 5 0 6 fcWiWfcftilSflfc 3 2 X 1 6 b' y ^*X 
?504£-gvtTV>4. f*HH4, 32t*-yh7-0^7A 
XX(PB) 12 2£tf-t/CIBQ5 0 2K£i|#-t4. 

0-*^5-f Y7U^y Ktl^V^ (LW.P 
C) 5 3 2fc«koTfcSSft4ffiWc3 2tr-7MM? 
>PT*7x>yf-$^S. LWPC5 3 2J4, PJU=..y M 
08fctM§ilfci'5^*fcfc**VO*4. Pa-7h 
10 814, O-^U-Kro^A^^y^ (LRP 
C) 5 3 6W5>**i:9>f h7a?5J*&<»? (WP 
C) 5 3 0k^X^fcJ:UfU-h*7'o/7^7>-^ 
(RPC) 5341^X?fc£t>3XT*H4. LRPC 

5 3 6ll^f3-^5 12, 514fcO-K3*V5 
ifcO— 5i ^ldWR»4riM> I B Q 5 0 2 rttf)ffiil£f& 
ijrfS. s LRPC 5 34(4, f3-/5 1 
2, 5 14t?fi£f , 'f^'yf-S*lTV^44r4M)I BQ 

5 0 2rto{aa$rf^-ri>. wpc(4, mr^-fyfc 

f-£*flT, Tu:/5A;rft'ja^?)&?)4^ h*<ifr 
^ItetttBSffifc^x-yf-Sft*. RPC534(4, 
t3-^5 12, 5 14fc$£r-f tSJvC^S 
ft^7n^5A*tyrttf>7KUX3:£*r$-4. 

[00 3 9] ^(4, 48h* y h7-H(CJMSil. V 
/H-71^^5 2 0, 52lSr^-LT4 8t*yhyU5 

16(c4oT^r3-r5 1 2. 5 14fcn-F3;ft 
4. iSM£&&^lf, *^J44 8t:-yhJ3Ut»57-Hfc}B 
Jfrf4£ *»WiHEUSl** 

[ 0 0 4 0 ] ><X 5 1 6(1, ffittf) 1 #*1M 
C, rn-rafc*) lo-fo, ft*2otfrtM*Sra-F 

WTOct>Jt-pTa-&-ri>8, 16, 24, 32, 40 
fe i.X/4 8 t'-y htf)7 y bcOffiSOM^t t4 




£ i: tft'^ 4 . 1 1M ^/I'+t 1 **U*>n- KTI* 
WJ^fcd, f3-/l, 512*^3-^2, 514 

ff-t4^C, ^4^(4«** { Hff§fL4^# 

tirfrrz-yizmztizmz. ham hm% 

±X'7y4 >§ft4„ T^^yHl, **>«#+fctt 

*rfi4. am N*t^S:^r-f4^<5or5^^yb^raa 
io -thzmm. -?m-7\s?-*5 2 0, 521 t-i^t§ 

ft.4. 

[004 1] ra-t-yfar 1 0 2(17X7-^^7' 
5>f y^U^SWfL. -?-iO#XT-^(lH5Sr 
#.^LTittaj$^4o 

[0042]^77^fy<7)SUf-i'll, PRE- 
FETCH (po) XT-^20 2Xh t ). ;«xf- 
^^)^y^-y^xttz\t^) , m^=. 

7M04»rHUXA^(PAB) 118±(CTh'!/ 

20 7VUXt$MZtlZ>. 

[0043] ftOXf—iS. FETCH (PI) Xf- 
x'204T'(4, T'uyyA.XtOtfffiX-ftZtl^ II- 
7M0 6i««'Jffl«7M 0 4*^PB/n'X1 
2 2£tf-LTft»$ix4. 

[0 044] ^>T75-f >-{4PRE-FETCHi34tf 
F E T C H X r - 1 8<J 0 & 4 fit JKfcT n / 5 A 7 
n-^+BlltT7 0 n^7A^^'Jl*|c7)fficr,#^, fci:*. 
if 4HKf&**tSjjttS Z t 4^T', P R E - F E 
TC Hfc4t/F E TC HXf-y'lll OWM 754* f 
30 Xf— v*^?tiLTV^4. 

[0045] <Kt, ^A*77rrt^^W, S3 
Xr-y'DECODE (P2) 2 0 6T'r3-^5 12 
4fcJ4«BR^3-^5 14Kf -fXX>yf-$*U 
T, ^(4, ffl^S^T, ^<^*»¥1-4S«Ti- 
-yb, fcUlfPl-7M08, AJL-.yM 104 

£(1DJL- -y h 1 1 2fcf-(^'!7f§W. fS^-Xf 

-x'2 0 6(4, ^<0^7X5r^-fBl^^i:^^ 
7*-V-y h*SWB2<0«5»k«^tC*W4rKU^ 

40 HKS-S-f 4 t £-&A,t'V ^4 . 

[0046]'»:OXf— x(4ADDRESS (P3) X 
r-x2O8-C*J>0, fiT(4, #^-rtT*ffiffl$ix4T 

-?orr y \sxim%^ivhb\ irttifiTQyjjjm. 

^YXsXmnZtlh. «-lf»4, Ai-7 M 1 0 
4£(iP;x-7 h l 0 8T*<t-f^ttfWt4. 
[0047] ACCESS (P4) Xf— x*2 1 Of 
(4, U-h'^^yHcOTHUX^tiJTJ^itTt^, X 
m e mBScT K l^XfiS^E— H SrW"f 4 D A G E N X 
50 **¥T7 H UX*«^$flTV>4^t 'J t^y V H 

if. ISHKTKl'.aiSeSfufcX.Xty (Xraem)H 

[00 4 8] A*4 7*5 A y<7)&<7)XT— READ 
(P5) Xr-x'2 1 2X'h Y ). ZZX'te. YmemS 
»7F]/-Xj|»&E-1 4 *#tftDAGEN YSW^rt 
*fcttffi»7Fl';* J E-F«:fl , *6DAGEN C*«g 
Wt'7 F I^^*H%* S fit H £ * * y V F #\ 

»»ais*ift. *^*>*8»tf#&&**ift.x : E , ./ffiM*> 

7FPX#a}:*J$ftft. 

[0049] fa7^7?tX^t:li U-F:*<* 10 

m.i-&zti>x'$h. 

[0050] milZ. WAa^ y M 1 .0 rt* fctt 
DJL-./M 1 2flT'llfi ; £iXftllfi : EXEC (P6) 
XT-V2 1 4#aft. te*#r-*U'j'*X** 

tz.\m%mzm&tihi)\ y- hat ^ 7 r .4 
-y Fja*fcttxF7^jfla.x*yfc»*&**ift. $ 

felC. v-7hffi**5EXEC^r-V*^S:fsf*lOT 

-?£&§ftft. 

[ 0 0 5 1 ] mz, r? a yra-k -vynrnttcom 20 

ft»ftJ:3IC. S1^^3 0 2{C«LT. il^^75 
4 y*^-5^IBTi~T7(c;bfcoT1?;b*i.ft. «JH 

fSKiTO-fe -yff Vv-y ^ D -y 7 K*ttft tV'/t^J? 

/VC&4. wmiS*i)WU77 4 yxf—'Jizwa 
LTHftfc*. S2W#^3 0 4tfM§lTiX''U 75 
-fyfcAft ft. S303*4*3O6fc#l/C. 

PRE-FETCHXr-^'2 0 2#$|fT3Tfi : £>ft 
ft. H6rt>£>#*>ftJ:3C, 7Xf— VWrVA yliZ 

nix. GttTzm+iMmzwm-tizktf?* 30 

ft. 70(^3 0 2~3 1 4£>£TK#tLT\ H6(i 
Jirf ft. 
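
The seven pipeline stages named in the text (PRE-FETCH P0, FETCH P1, DECODE P2, ADDRESS P3, ACCESS P4, READ P5, EXEC P6) and the way successive instructions overlap in them can be sketched as follows. This is an illustrative model only: it assumes one instruction enters the pipeline per cycle with no stalls, and the names `STAGES`, `stage_at` and `occupancy` are hypothetical, not taken from the specification.

```python
# Illustrative model of the seven-stage pipeline (P0..P6) named in the text.
# Assumption: one instruction enters per cycle and nothing ever stalls.
STAGES = ["PRE-FETCH", "FETCH", "DECODE", "ADDRESS", "ACCESS", "READ", "EXEC"]


def stage_at(instr_index, cycle):
    """Return the stage that instruction `instr_index` occupies in `cycle`.

    Returns None when the instruction has not yet entered the pipeline
    or has already left it.
    """
    offset = cycle - instr_index
    if 0 <= offset < len(STAGES):
        return STAGES[offset]
    return None


def occupancy(num_instr, cycle):
    """Map each in-flight instruction index to its stage for one cycle."""
    result = {}
    for i in range(num_instr):
        stage = stage_at(i, cycle)
        if stage is not None:
            result[i] = stage
    return result


if __name__ == "__main__":
    # Print which stage each of three instructions occupies, cycle by cycle.
    for cycle in range(9):
        print(cycle, occupancy(3, cycle))
```

Under these assumptions, instruction i reaches the EXEC stage in cycle i + 6, so once the pipeline is full one instruction completes per cycle.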

[ 0 0 5 2 ] 07 fcjjrf J: 3 (C, *#yj*)£«D£ifcW 
li, 2 4 h'-y F7 FUXAx 1 1 4&£tmijfa 1 6.t' 
yhf-^Ul 16Sr^-LT^^ ; &yi-.yh 
H*> t||^8#i*>tUWx^-» F 1 04fc£//C 
Uft. ^ ; e'J'ffaa--yhl04(i24h'-yh. 
7 FUX/\'X 1 1 8fc«fctf 3 2 t y F^ftr-*'*.* 40 

£*lTUft. ^t'Jfla- 7 H 04ti3 2h*-y F7* 
nm'J-hVU(PB) 1 2 2Sr^LTVi/>7'D 
t7t37102«U-7 F 1 0 6fct>l£&$*rCV> 
PJL-.y M 08, F 1 1 OUXX/Djl- 

•yM 1 2(iT-^y-K*3J:tXT-^7^h/^feJ; 
lW^ft7FW*'tt*tf LT-X*'J*a:J.--y F 1 
0 4K3S-&2;ft.T^ft. Pa_.yM0 8li?^C7o 

[0053] JOBHKtt. Pa- 7 M08li24h' 50 




•yF7n;/7A7FUXAxi 2 8t2ocoi 6t'y F 
f-;7^A'X(EB, FB) 130, 13 2fc20 
<7)1 6h'y Ft-^'J-Fa'X (CB, DB) 134, 
1 3 6fct«t-7-C^ ; eyilii--y M 04(C*S^§ix 
TV>ft. Aa- 7> 1 10(2. 2o<024t'7hf-^ 

54 F7FUXAX (EAB, FAB) 160, 162 
t20<7)l 6t'yFT-?5-f FAX (EB, FB) 1 
3 0, 1 3 2 1 3-?<y)r-^ 'J- F7 FI/XAX ( B A 
B, CAB, DAB) 164, 1 66, 168£2O0 
1 6t*-yFr-^y-F^*X (CB, DB) 134, 1 
3 6kiftlX* ; £V f m3---y F 104fc^3*tT 
^S. Djl- .y F 1 1 2tt, 20^f-^5^ F^x 
(EB, FB) 130, 1 3 2k 30tf)T-?U-FV\' 

X(BB, CB, DB) 144, 134. 1 36fc2r^ 
LT;<* yf Sjl- . y h 1 0 4fc*££§;fVO*ft. 
[0 0 54] 07tt. fcfc*tf#tti&4tel6iS-$-6. I 
y F 1 0 6HPi^7 F 1 0 8's.<7)tir$<r>ffi&£ 

mm^ri2Ax-m^Lx^h. am*, ia 

-•y F 1 0 6*»<oAJL^.y F 1 1 0feJ:^Da-7 F 1 
1 2^\<7)f-^waii5:#Bl»# 12 6,1 2 SX'ttl 
?t&7jkLX^l. 

[0055] *W%<V>Z(r>mm\X'\i, JOaxyj/'y 1 
0 0JiV><O7t)>c7)7 *-Vy FT'Vv-yffcS-fcrj&g-?" 
ft. £*?&fr7 4— Vy FO£<y}Jd£<Sr^0lj£jil 

[ 0 0 5 6 ] 8 h' y FtStff : OOOO OOOO 

(MMAP0) tfzlVJ-Vtf-hmffi? (readpor 
tO) i«)J:3=fl^tt?J4*t:8hVF»fm 
* (OOOO OOOO) **tW)»Tft*. COJ:? 

&i§3\ M«tttt>f yry ^ -y f x-hh . 

[0057] 1 6 h'-y F^ : OOOO OOOE F S 

55 FDDD 

F-i/*X^£OI*l§ (7ti:^tf. ds t ) ^*-?-c7)U-^x^c7) 

m<r>?m ( d s t ) t y-xu-vx^cortg (src) 

[0058] 

[Equation 1] dst = dst + src

[00593 n-miBtb-t. 

[0060 3 Z<?)£o%$iWZ. lb'-yF/^WM* 
-7;l/7 F ( E ) k 4 tV F V-xixi/x^^"] 
^ (FSSS) t4t'7Hf'Xf -f^-xayUyX? 
Mm- (FDDD) k tttl 7 h'-y h»ftm (OO 

oo ooo) 
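
The 16-bit format `OOOO OOOE FSSS FDDD` can be unpacked as in the sketch below: a 7-bit opcode, the 1-bit E field, a 4-bit source register field (FSSS) and a 4-bit destination register field (FDDD). The bit positions assume the letters of the format string are listed most significant bit first, and the function name `decode_reg_format` is hypothetical.

```python
def decode_reg_format(word):
    """Split a 16-bit word laid out as OOOO OOOE FSSS FDDD into its fields.

    Assumption: the leftmost letter of the format string is bit 15.
    """
    if not 0 <= word <= 0xFFFF:
        raise ValueError("expected a 16-bit value")
    return {
        "opcode": (word >> 9) & 0x7F,  # seven O bits
        "E": (word >> 8) & 0x1,        # one E bit
        "src": (word >> 4) & 0xF,      # FSSS: source register field
        "dst": word & 0xF,             # FDDD: destination register field
    }


if __name__ == "__main__":
    # 0101010 | 1 | 0011 | 0111
    print(decode_reg_format(0b0101010100110111))
```

For an operation such as dst = dst + src, the `src` and `dst` fields would select the two registers involved; the register numbering itself is not modelled here.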

[006 11 1 6 h'-y FtStfS-: OOOO FDDD PP 
PM MMMI 

ZtM. tzkltfTXT-t^-i/ayUiSxtcrift® 
(fctitf. dst)4««iJfll«rtS!(Smem) 
[0062]

[Equation 2] dst = Smem

[0063] 1 6 t'-y Fffo^i 0 1 O«0M?i>2> . 

[00 64] Z(7)*0%til$ii, 4 t'-y F*£f£*H§- (O 
OOO IHh'vhf^fa-i'ayi/ i?Z?mW\=? 

(FDDD) t3t:>7hmy?TVUX (PPP) i 
4t*-y hTFl^SOef (M MMM) fcl«/ISHg7 
F^T.4 Vi/*r-9 ( I ) fcS-^T^I). 

[0065] 24 h* 7 Mft^ : OOOO OOOE LL 
LL LLLL oCCC CCCC 
£*U4, 24t*>y FfSHK fcfc*tf*fHHWfr**$J:tf 
^**Stfc$iT.4*§&?)*7-fe-yF (L8) t 

[0066] 

[Equation 3] if (cond) goto L8

[0067] ey-mzmbt . 

[0 0 68] £*>J:3s8rtMMi. lt'y Fa'^WM* 
-7/P7 4-/UF (E) t8t'7 (L 
LLL LLLL) b 1 tf-y FfWIMfrf' t£3fi (o)t7 
t'yF&ff7.f-^F (CCC CCCC) 
7t' 7 FtlflM*# (OOOOOOO) £#X/t'V^„ 

[0069] 2 4 t'>y FffHr : OOOO OOOO P P 
PM MMM I SSDD ooU% 

saw* (AO io<o*jwHkows (ac) 
.ty^ * y ttsort^ ( mmM>wh & > <7>zm^» 

£A*>fcl6*i:&<K (DR3) 

awe* * y ttuwrtg t & s t'j^yp* 

[0070] 

[Equation 4] AC = rnd (ACi'Smem'Smem), DR3 = Smem

[0 0 7 1 3 Oio 1O<O0HTJ>I>. 

[00 7 2] £*>J:3$r#*li, 8tf«yM*fBf9 (O 
OOO OOOO) kSh'-yh^y^THUX (PP 
P) Ut' 7 f^FW^H? (M MMM) b 1 t*>y 
FUS/HgrFWU yy>-^7<-;^ ( I ) i: 
2 t'-y F ( S S ) fc 2 t'-y hfXf 

a >-*JHMWfF ( D D ) fc 2 t'-y hflMWf- 
j£36 ( o o ) fcjgfr&fr? 4 -/H< ( u ) b 1 t'«y Fft 
mTis3>7 4-JVY (%) 2: 

[0073] 3 2 t'-y Y$i<§ : OOOO OOOO P P 
PM MMM I KKKK KKKKKKKK KKKK 
Ztlte. 32t'yFl&^ fcfcitf-Xt'JffiKSme 
m) co-Jltt (K 1 6 ) tO^JtKtlBt'CrX 
(TCI ) *»l*fcJiOKRJgS*i*(dr 

[0074] 

[Equation 5] TC1 = (Smem == K16)
[00 7 5] THAT'S)*. 




[0076] £OJ3$r**tt, 8ty hJSfftH- (O 

000 OOOO) fc3t'-y YTMyfTYVX (PP 
P) fc4t'-yF7Fl'.X£S? (M MMM) fc 1 t'-y 
\-W&/W£TYVA4 <-)VY ( I ) b 

1 6t'-yF5g&7 4-;UF (KKKK KKKK KKK 
K KKKK) iSr&X/t'V^. 
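
The 32-bit format `OOOO OOOO PPPM MMMI KKKK KKKK KKKK KKKK` and the example operation TC1 = (Smem == K16) can be sketched as follows. The field widths follow the format string: an 8-bit opcode, a 3-bit pointer address (PPP), a 4-bit address modifier (MMMM), a 1-bit direct/indirect indicator (I) and a 16-bit constant. Resolving PPP, MMMM and I into an actual memory address is not modelled, so the caller supplies the Smem value directly. The function names are hypothetical, and the bit order (leftmost letter is bit 31) is an assumption.

```python
def decode_const_format(word):
    """Split a 32-bit word laid out as
    OOOO OOOO PPPM MMMI KKKK KKKK KKKK KKKK into its fields."""
    if not 0 <= word <= 0xFFFFFFFF:
        raise ValueError("expected a 32-bit value")
    return {
        "opcode": (word >> 24) & 0xFF,  # eight O bits
        "PPP": (word >> 21) & 0x7,      # pointer address
        "MMMM": (word >> 17) & 0xF,     # address modifier
        "I": (word >> 16) & 0x1,        # direct/indirect indicator
        "K16": word & 0xFFFF,           # 16-bit constant
    }


def compare_smem_k16(word, smem_value):
    """Model TC1 = (Smem == K16) for an already-fetched memory operand."""
    fields = decode_const_format(word)
    return (smem_value & 0xFFFF) == fields["K16"]


if __name__ == "__main__":
    w = (0xA5 << 24) | (0b101 << 21) | (0b0110 << 17) | (1 << 16) | 0x1234
    print(decode_const_format(w), compare_smem_k16(w, 0x1234))
```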

[0077] A — Ft aT;N^ : OOOO OOOO 
XXXM MMYY YMMM SSDD ooox ss 
U% 

10 iftli, rA-KfaW^-fe^j bn&Zbtf 
•C#|.3 2fc*-y hfiW^tX^. ifctt, ftfc 

7/hft*Tft6A-F7'a^9ASitte'7*ATyw*y 
i<0ia«rtfir*tt2o<0DAGENaaH t 

20 [0078] 

[Equation 6] C=rnd (DRi'Xmem), Ymem = HI(AC.«DR2), DR3 = Xmem

[0079] 8h'«y F$fl3?^ (OOOO 

OOOO) , 4 t' -y F 7 F VZ&grf- (M MMM) ft 
#3t'-y FXmemtf-f VfTYVX (XXX) , 4h* 
«y F7FV.XS55 : P (M MMM) ftl3h'7bYme 
miKAyfTYVX (YYY) , 2 f-y hV-^7** 
Al/-^ (AC)iBff (SS) , 2t'7bfXf^ 
30 *-s/ 3 V7*aAW-* (AC) BHHP(DD) , 
3£-yFftftf*t«SB<ooo) . FVF77t'yF 
(x), 2h*7 VV-XT*3.J*\s—?ffl'\=F (s 
s). lh'-yF^7^a7^DR3iegf7-f-;l/F 
(U) nXl/lt'v F ^7^3 7^**7 -f-^F 

(%) mt^t. 

[00803 @8(i, ffe^*fi5j:^y7 hfa7/HS^ 

**fciis Jtwmio^«^fc>tys^f*4. as 

40 A>4rttth*>tifeirfrk LXffif&Ztih Z b 

[0081] ^n^2<D&MlZ (-f&;b*>, **<?>«}: 
9KV^7*a^5A7F^K»LT) iH$ilS«fr 
tt. --*ftf>*WM>» 1 *>#4*i:£*lfc:#££33lrC§ 4 
*>£'? #>£jjcWM *-7/k7 -f -;UF ( E t'-y F ) 
&4tAse^&. M&M*- 7;Wt'-yF{i, 

7*- V-y hit#*^0r^tfO3f7-fe-y FT'ieSS^S. 

f'3-yti. •♦HffttHW-ifcftC r Ej t'-yFt= 
50 [0082] <r*«T*»;{c^ * 'J iif^Sr § -^§81* 
»4, ro-fc V9ST YVX^x^r-i/iz
f3-^ll «^7*- -?-yb£*n<o 

7-y b fcftfW'M 79 -i V<n9 »T4 fDVXT—V 
.tl*tt?& -9 T t a#£HRfrC& i J: 3 t-t 4 fcftfc 

[0083] 'JrfMWft 1 <0^t LT*W*f flfc 

ffl/Sli, J^*wa**r3*i4*»£ 3*»***7 >r 
b'£**UifMH;:£a6S^#&V^i?;&l>. -e^Jt 

hZbtfX'ZZ. 

[0084] fl&ot 3 loofij^i. i&45-2*«9ft2<Oi8r 
*MW&*>l&MMr>v- b'«7x 7<4 . **«<oft 1 
*>f&*fctM"f *fc*>*>A- b'7x7<W*-y bT* 

<7)ft20#^4 0 i>(Sv^rnj/7-A7 b'l'T.SrlrtSift 

*tofl£v a 7 b* £3rt s t&<£ffl<9ftf- y \- 
b* -7 x 7 <W-fc -y b 1 i i: #T'# § . ZMz 4 
0, a^^-H^xTtfJ^StJitfiM^t^eB^'Ja 

ymma 4 ixiggm* sfiwr s £ t &vt h . 

[0085] rfWW^oWfjrfrfcMSMBWC* 
[00863 09i4, f t x7/Hfe*«r£tfS*S'*$r* 

a vxf— 04 1 mmiz. mwrth>u 7? 

[00 87] 095:05 fcit^St, Plli7x7fX 
r-v\ P2J4**x^-S^ P3(47b'W-7.H-gXf 
-x\ P4J47?-feXXr-v\ P5t4'J-b*Xf- 

ii****-. Ei>4im4» Efc4l>'FyN'x£;frL£7 
4 b7?-fe**<eiveft£H-. ^75^iDV^ 
(-f&*>*>. ££tl>,ri:3:<y-b't>4tf 
b7?-t^2:filtS^^^T1lfir-C-#l»J:5{c-fl. 

[0088] 01 0J4. faWt'J7;t^<0 
#£OJK*£jjr$\ <eft«4, MPJ1x£-&X/t'VV&2otf) 




0 1 0 tfOr x 7/M * U tIMrti . V 7 b r x b 

mm. zz-cum&MtbitW&ii. *ta± s 2-0 

<r)7a?7J±Zixrzi'vy)VX*:>)79*.xft<%itzt 
Ufa v)M y-^T-t V7yX"§5<§7u*. -ytfrtT'^ 
LXBfSLZtih. tth-h. ZcoM^^lt. if 3.71V 
^<n£oliZ7u7yVl l Zk'>X7u?yJ>.%ixtZ')7 
V7o7y^tihZ\bli^\ Z 

mmmz 4 <o * * 0 79 *x&mz&mt& z t 

10 ih, TEOttJ&Oflmt, V7hfa7/^li, r 

x7yi^jg^7V 3 vtt-fhWmrYV-visvr^z 

-f X^7^f -f *LC«JiJ8»t: J: 9ttttl*l±S:aW-t 
1. 4 3 ICV 7 b rx T/Hf^-Srflr^fW-S £ fc 

[0 089] V7hrx7;l^^-{45h'>y h^^7-f- 
)VY7 0 1lz£^X®fcZtl. 01 0t*-t4 3Kffl8iJ 
OTIB<0#^7 -f -IV Ytm8£1XX\^h . ?7~7 <- 

iv kwm xt4, #^^-^ yru p< yf-y 3 yfcM-f 

20 6«ttWli»i:LTtt6. 

- £fl£ft-^t7*-V'yM4. 2OC0rD^7A$tL 

-•7 -y bc7)W4 0 1>7^^ < &£»&^4 a . 

- ^7^77WXI18<«tW. 

[0090] TIBOt <r>tf?77 4-)VY10\ liZffi 
<, 

- « 1 <D#^^W7-i)^m?-§-7 < -iv Y<nmti 0 

30 2, 

- ^l<0^fcSttSia^^'J7KWX (XXXM 

mm) 1 0 3axx/m2^^zn-fh^&^^)7Y 

VX (YYYMMM) 7 0 4 £-&tri£&7K W7 - 
;Pb'703/704. 

- Jg 1 <D^^W*afm#7 4 -IVY 7 0 5«« 
«. 

- mi^^K*f7ST-^7n-7-<-;WK7 0 
6. 

- ft 2 «^>»l^fcW*»fWJP*7 ^ K 
40 7 0 7. 

- ft2(0^eJtri»r-^7n-7^-yl/K7 0 
8„ 

[0091 ] Itztr-yX. V 7 hrx7;U*^tStt* 
*S£7h*W.*«4. IfiOffilWfarmKHtit 
OtV7brx7;l^r^l«J^tfflSC«^$^l>. * 
ilfciO, wm-*ft<$947*toi,zt%<7Y\sx 

mzmtiiX'Z «a?7 k vxmomtktr&b 
ti&. ttum&titaMz. ±aufcj:afc. V7h 

50 fc3^*»6T**3. 
[0092] 2ocr)rn/7A§it7t^cO#^* { f-
?7YVX%!k (DAGEN) 7 4-lVY*-£tS&&L 

rtfctS^D AG E N7-f —)V F b tfX'Z 
h. ^DAGEN7-f-^H$riSltS^i:^J: l 9. V 

£**T'#I>. 

[0 0 9 3] 111 Hi. 2oc7)Mi^SrV7hf i7 

OflSUWW* 7 2 1, 7 2 2 (iXr-^* 7 2 0 tfl^S 10 

[0 0 94] Xf- y*7 2 3T*'tJ:oC WMW24 
h'-y Hrf*7 2 ltt. * 1 M*I<08 h'-y h^f^Hrf 
7 24t, Vm*A\-ft<F> : i/V?)V*^) (Smem) 
7FU*7 2 5i, bl*JOT-?7D-b-y h 

7 2 6fc*-£A/CHS. S52<D2 4b'-y l-iiH*7 2 2 
S 1 AM M*|<D8 h'-y M*ftf*#7 2 7 1 , &tf)A' 

rt^r-^7D-t'-yb72 9i:5:^-C'V>l). 7.T- 
^7 3 0fcfcWC, 8$ft^b'-y Mi-ffl-m, 20 
4W»flW<-f Y-7 24, 7 2 7rtT TO j fc 
S^lTV^. ^y^;^^UTh'l^X7 2 5, 7 2 8(i 

77KPXh'-yh r Aj 
^-?b'-y h > I j Z&tsXolztjkZtiX^h. *tl 
It. Mm* * V Tt-tXfctt-fZ 7 K W^**ifflS4 fcli 

8b .y vwuzm-iwzm.zmmtizktfx'* 

h. 5 4>fc. 2o<o^tt*tf*ttfc"*-4«BWttr<. DS 
ia*4UHl20>*4fcliltt:*'tf bRfc-rsifctf 30 
"CSS, 

[0 0 9 5] Xx->'*7 3 5(C43V'>T. mi^^cOii 
fPflrf-7 2 4J±2owaMJ-fc^MWSitS. tttf2fiF#7 2 
4 »8 b'-y ^ *> 7 b'-y h £tt *Uf X \>\ *tl 

■efc t m 8 o ~ f F«Bfje«Hrt fc£"c*» * y ^ 

i t tfX't h * * y n - H v «y b y ^ofS*?* s . tk 
W-x7 26. 7 4Oi>,J:tf01 0T'^I»J:3 40 
fc. «tW«Kt:J«-4JlfMWi»ilS*i*. fi^ 

?:?'737i:Sll:fc«J:tf3?2 «MMHC*«-* fep7HU 
X7 3 8iOlSl;:KfiS*U 4 b'-y hi^7KW7 
3 8C0fttES§tl.5. 

[0096] 3 6-Cti, V 7 hf*7/l'M 

5. 7 28O0teiH*SitT^4. £*Ui£T*>#* 50 
-XTb'UXXXXttdiYYY b3t'»; bg&Z? (M
MM) fc£J:oT-?-ixefl**S*ll>. 7.T-V7 3 6 
liS&20>Mt0M 1 FftM'vOSS 1 <0*^(cWf 

[0097] *tf)ISII, 010 KflttV 7 h fa7W 

2 >- V t 'J 7 ^ tXft-friZttt h »^-f x 
^.7;Ur a tfift^Z b ififtfrk . 2 -pct)^ y ^ y 
(Smem) ^Xraem, Y m e mT'SjSt"-& 1 
IZ&^X. rV7bT.3.7/Uj 9770 l/737£if 
A-TS^fc+^b'-y h*«jS«cS<t4. V7 bf'A7;P 
^^ifrtci;'?. rn-^Ji^MSr^ty^tLT 
« #-f £ t Ztflfttl £ fc * . ffrfr-fe -y h "7 

•y vyfzmm lxmz vti&ttv < v y*? 8 o~f f 
rt-c»#fl:S-*i* ; t Sr«U-r I. i t ^-c^ , -tnt j: 

-?T, %\<nWt&&7 2A<mmX->Y (t'-yh 
7) S-fjLT^-i-^HflF^fkSJlMftfcfcSKIBK 
tl»C:i:* i T'#l». 

[0098] Wpr<$hm\±. 01 lK*LfcS*y* 

Dt7t, Jtfcitfn>'AM5*^tt7-fe>'7*5tJ:-5 
THJfe^itS. #^7*n-b-y-!fKj:-oTff*>aS^T-y 

rimi 2^c7D-laT'S§^^T^^l.. 

[009 9] Xry7S lttJ^T, ^7a-fe7t 
li. V7h-fa7A^{clS^Sit*?riBtt<0*42o 

^i&m^z'ft ozt i)*r>T~?ttti3 y b o- 
;l/7o-^S^Sr4b^rV^<7)fc-t&^^'J,^. 
b>y l-fltf^tSHHi, 7FWXS/x*l/-^ 'J V-Xt 

b-ticMzmmuzx* y 7^-txo^-^ 

[0 100] L£#oT;Xr'y7S2CfcV v t. 
7nt7tlt DAGENSeRSJWfl-Siktcior 
2^<r)X?VY7u>* : tV$ittm;LXV7bTa. 
7/H8HS- 1 4 i fc ^HffHnStt*«^t4 * 1 <07.t 

[0101] Xf-y7 , S3ti3^'C, WJ7D*7 

y7hf*7;W^^7 3 7«Mt. 01 1 

-f h'fl[fi^tT'>5:< ftfftff J:lX7 H 

o-fe-y^tj:-o-ry7 hTA7;^^* { aj7j^^, 
[0 1 0 2] HI 3(4. VybfjLTMMt&ttiH
^7o*x£^1-B&7n-y?0T'&&. 01 314. #<fr 
Ay7ra-7 F 1 0 6*^^)4 8 t'-y Fift^7-F8 
0 0<7)fi-f£*1\ 

[ 0 1 0 3 ] II 1 3 t^-Ti: d K**7- Ka£(c£X 
SfliflW^- (opcode)R **^^fS^@ 

s&wkuis 0 2, 8 o 4 14 . aa»fi t>v& tm y 7 
7>HW** t «tsii*'<# -5 36»*fflafc«as 

*TM79Mm.. r Ej t'-y F£fc(*y 7 hfa7 
;MifW»*aj? UT 7 * --7 7 Mtfl8 0 6 *^fSr^ 
#2754 *yFfc«fci/y-?-y ty^S8 1 sca?- 
J:3£?JH-7*l'?-9-8 0 8fclM , t'ft. xVTVUTK 
U-y^y/«ffl8 1 0fc4:^TaT;UTHW>y^>/i& 
381 2J4M?dfc:«LT. <r^OS«*»fe*KBf€<0 
5T7-b -y FT'K«S:ft.&7 F UX7 -f -/PFtf)ft#£li8 
#rf &£i:#T'# TiT/MS^S8 0 2*>J:tfy 
7 YTT3.TM?7y 4 H«-fW18 0 4<oaj?j(4. 
itfl8 14fcJ:-jTllr&3fU "WW>ru*"*8 1 6** 

a^iist. t^-tivtyv -y^yymms 1 2<7)tfc>J 

(4DAGEN3y ho-zWCjlSfU fjt*(tfltf> 
yy/;y7Hl/7 ^^1*38 1 0<7)tfi?WD AGEN 
nyFo-McaSftS. 
[ 0 1 0 4 ] .BsBUrJ: dfc. iWmx\t. 
(4, £^£flM*£#Jfc**H&<3^<0jra*tf>DAG 
E Nft-f£B$nT!>?S-£D A GE Nflr^S-frtfi t #T 
H^fe^rtWDAGEN^^Iite^DAGENflF 

DAGEN*^£ESLTI8£DAGENflF^7 4--/k 

■^•7 -f KfcHS£7-F 7 <f -IV KO-SfcJgjfct 
■SxIfctfT'^l.. *S£DAGEN7 4-/l'F(CJ:0, H 

[0105] *Wy 7 Ft *T)V&3tTb&%t>1i£. 

wtww 6 mrcc y v -y ey^jWfMf-c** . 

^7-f-;PFy7-yh-y/Ha824(4, x /7 

»o* i otfriMcratf * ffi ?6<o y v -y ^tff ->fc 

ft:. *<0 'J v -y 7 , £ix;tJifFfi!$B£S& 1 OtSr^fflOfg 
^•f&ll8 2 6 Kilt. Rttlts ffrfrtt*>!S2eHfrfr*>fc 
W^T7'f>yM3j:y ; !J77ty/iffl818 
tf. V7bTar;^^a^»a8 0 4O!B*fc:iESL 

(Cji-t. ^77-f^>-bfc<J:t/7^-/UKyv-yty 
^KW8 1 8(4. mmt'-y hi 6, b'y F 2 4 , h'-y 
F 3 2 £fclit*>y F4 O^^mmzm-oXW, 1 Offrfr 
0)7=r-v-y M:jEEtr^2c7)^5ry Ty4 
[0 106] 01 OfcJ:tf01 3 £#{It£ i: . 01 3
t^rS#®fiW4^< v 7 r #> <fc 

(4. y7hfa7;H«r^/7 <f -/VFrtOP;r5g£>? 

mjEgLT. er^oy7hTA7;N&^rtols^7H 

UX7 -f -^Fa>£><7)85 1 Wt'J^tWS* 1 fl> 
* * y 7 h* U*t$ J: tf» 2 * 'J <HMC*W 6SS 2 CO 
**y7FUX£*S^H-S. 
[0107] M^J-f *-77H:*-y F«#i&S8 2 0(4. 
10 82«>frfr£SS 1 W#^tM^ta-^LTIIfi : -C# 6A» 
t*d *>£ftfiEf 6 <t 3 fcf«W6 . y 7 hfaT/^ 

uaBW *-t> ( r e j ) t* -y h**4$rvvfc*>. y 

7h7*a7;MrtW«lfctaS<iii:, d^a8 2 0(4r 

[0 108] 014(4. 77 Ft aT/^K'f V*- 
7 x 4 Xfh * ^ 'J A'X^®8ISr *tBS7D -y ? 0T'J> 

y . 0 1 5 (4y 7 F f*7;H^ffl^5 >- F 7 x -y 

[0 109] 014(4. CyN'X7 5 0, DAX7 5 2. 
20 EA'^7 6 0fcJ:t/FyN'X7 6 2S:^L. Ztl^<D^X 

(4. mizgmztix^&tK mm,zm*mtixi&*% 

[0 110] y7hrjt.7/l-7x-y^3yhD-5 7 5 
4(4. 7o-b-yHf3 7 1 0 2O#WMWM6<0--»S» 
flW*. «U4, ^7yK7x7ft8l7 5 6, 7 8 
2S:SIJ»LT. »ltf)f-^7O-yc^7 9 0fcJtf4 
XfcJ:lXY^7yK7 58, 7 8 0tlg2£7)T-^7 
a-ys'X 7 9 2 KJttS X*> ilX V F 7 8 4 , 

78 6fc£CfcJ:tfDAX7 50, 7 52*jfLXZtl 

30 -?*i7 x -y *■•*-•& i 3 cfairt-s . o rn-fe .y-y-n 
7 1 o 2^^mw&m<o~B^m^t s y 7 hr at 

;U?-f hnyhD-7755(4. >9— 
7x-fX794, 7 9 6S:0JfliltT. Slc7)r-^7n 
-A'X 7 9 0 *3 &V : ffr2<7)T—? 7 o-A-7. 792H 
EA*7 6 0fcJ:l/FAX7 6 2^^7> K<7)^&# 

[0 1 1 1 ] 01 5(7)^(4. V7hfa7/^i7f3 
y h D-5 7 5 4 (CJ: -oTff^l, sT^? > H 7 x y f- 
MIHIfP**^. -it(4. X^yF7nyC'fiffcix5# 

40 -^tw^t Jt^^Jf-^-w y 7 b fxW ^ y 

(Cj>tt47r^yF7x yf-70-/\«b5:^-r. L 

4*^(Ci4 . ^5 V F V *JX 9 (4 Dy <X*> £>D-F§ 
^. -eWCty. ^^yS^D^'Xt^^WT'. 2-9" 

mftZtlht. 7i7f3^t-n-7«, Yraem/U 
tJtt4^7yF7x y^7D-Sr^i-C. S** 5 C 
IWi'MlftWt 4><l. 1 5 0 0 fcyrrri; 5 y F 

*D;Uttt4< C><X*^7x -y^-S^I. «k 5 tf 
50 ^7yF#lfcJ:^^5>-F#2(4lRlt-9--f ^ 
JlsftX-mity xv^ZtiZ. ffitWmWyJ
-yxjxizfflftZtiZ,. tit Hi, Ev<*g*£F^ 

[0 1 1 2] 01 6«\ Hl^To-fe-ytfl 0£n*Ji^t 

**»EiB4o*>*iaT*&. m&mn. wzmm; 
mm (asic) sflrj^fflLTH3S-ri.-fc* j T# 

^■y h(*l(cjftH-|»JtS , )<Olll8§TffiJi<oaSUOUy, * 
fc(4ffi«ff ESSflJ&fllfiS ti-&ZktfXZZ>. 
[0 1 l3]fcfcitfHl60«lillIBt:rtKSftTV^ 
1,7-u-fe O^id^axyx'ycoioojEffl 
(4, fckifcPSWi*7>f^^««lfll«l<0J:d^ 

-m^-r. ei 7(csrr«fjg^««i. mmmmm. 
7 n f Tf < xrM <r>i 0 %mmtcw&* mm 

IXmtth Z t tfX'Z 5 . TU-fe 7tl0 l4*-/\*>y 

HI 2fc:S»$<t. aflWr^f-df-HT^r^ 

^musuaMfc (rf) 0B**tf7>r 

(4. rn-t»-9-io*#tr*SHilB4 0fc:rti8ELTt; 
*£*»«Htl/Ct>.):v\ RF0I81 6(47^x^18 

[ 0 1 1 4 ] V 7 h^tfAWt'JT^-bX^ 

JR 

2 ffl^tjittS f3-/l^ 1 ^^-t*ttl.T=J- 
vbkTbZb . x 'J 3 yy nvx.x 

[0 1 151ii-CttfflLfc ramnS^ij . rg^§ 
til. j r^fljj t^dmm*. 9Mttia»*Ato 

izmu&mmhi%£t$tr>x . s^wfc^^ns; 

[0116] SBt«(cooTaW»!Ht»"flLT*fc*«. 

wiw*ew)tf*«W)fl!w)3 * y*«rwww<e 

[0117] This application claims priority to S.N. 98402456.2, filed in Europe on October 6, 1998 (TI-27685EU), and S.N. 98402455.4, filed in Europe on October 6, 1998 (TI-28433EU).

[013 *^0||SfeW^o^rn-fe>y9-c7)%7-D-y 
70T'&&. 

[02 3 0l£7)ro-fe-y-9-O3r^B&0T^I., 

[03 3 01 ®7Dt7W3r»aS'J SrUffx- 
10 >yh^,J:>)B*B^DSyp-y?0T'J>l). 

[04 3 01O7"a*y^ift^/^7T^a-fcJ:tf 

tit$Ta-yayha-y<?)vmX'i>&. 
[05 3 01<?57 , D-fey9-^NM7'^-<y7i-X^ 

[06 3 0 1 <07*n-te •yttiJttl.^r^-f y«0iftf^ 

[07301 Wn-fe y -to/ ^ 7*7 4 ^Ol!)#Sr^ 

[08 3 #4tfttf>0!* SttH-C* I) . 
20 [ 0 9 3 § * r i WWS^ <*1M ? forfflt? 

M.yy^Trctmxhh. 

[01 o 3 y 7 YTx.Tivft^nmmtt-tmx'h 

[ 0 1 1 3 y 7 h t a T/Hfr^<0?&£ Sr^-T B&0T'£> 

[012] y7HT*7;l^fB£<O7n-0"C&$. 
[01 3 1 y 7 h-r^Trt^fcSSfrf *^*>7*o-y 
^0T'$)S„ 

[014] V7 hfaT^Sf f >f y?-7x^x 

30 -t s * t u / -a zaki-mx'h h . 

[015] y7bTJLT;l^^5r'<.7>h'7x-yf-$l| 
[016] 01cO7*O-l:-y^$-rt^tl.*«IIIgS^B§0 

[0 1 7 ] 0 1 coro-fe >y^^rt^tsm»ilfiSB^ 

BS0T*S. 
[Description of Reference Numerals]

10 Microprocessor
20 Processor backplane
22 Backplane bus
24 Instruction cache memory
26 Peripheral devices
28 External interface
30 Register file
32 Data address generation subunit
34 ALU
36 D unit register file
38 D unit ALU
40 D unit shifter
42, 44
7 5 4 * y h h X V y i -)V Y y v ••/ f y

8 26 §nnm 



[Fig. 1] (drawing omitted)

[Fig. 2] (drawing omitted)

[Fig. 16] (drawing omitted; legible labels: 100, 20, 22, 24, 26, 28, ASIC backplane)

[Fig. 8] (drawing omitted; legible instruction label: ACy = rnd(ACy + (Smem * ACx)))
Compound Memory Access Instructions



This application claims priority to S.N. 98402456.2, filed in Europe on October 6,
1998 (TI-27685EU) and S.N. 98402455.4, filed in Europe on October 6, 1998
(TI-28433EU).

Field of the Invention

The present invention relates to processing engines, and to the parallel execution 
of instructions in such processing engines. 

Background of the Invention

It is known to provide for parallel execution of instructions in microprocessors
using multiple instruction execution units. Many different architectures are known to
provide for such parallel execution. Providing parallel execution increases the overall
processing speed. Typically, multiple instructions are provided in parallel in an instruction
buffer and these are then decoded in parallel and are dispatched to the execution units.
Microprocessors are general purpose processing engines which require high instruction
throughputs in order to execute software running thereon, which can have a wide range of
processing requirements depending on the particular software applications involved.
Moreover, in order to support parallelism, complex operating systems have been necessary
to control the scheduling of the instructions for parallel execution.

Many different types of processing engines are known, of which microprocessors
are but one example. For example, Digital Signal Processors (DSPs) are widely used, in
particular for specific applications. DSPs are typically configured to optimize the
performance of the applications concerned and to achieve this they employ more
specialized execution units and instruction sets.

The present invention is directed to improving the performance of processing
engines such as, for example, but not exclusively, digital signal processors.

Summary of the Invention

Particular and preferred aspects of the invention are set out in the accompanying
independent and dependent claims. Features from the dependent claims
may be combined with features of the independent claims as appropriate and not merely as
explicitly set out in the claims.

In accordance with a first aspect of the invention, there is provided a processing
engine comprising an instruction buffer operable to buffer single and compound
instructions pending execution thereof, and a decode mechanism configured to decode
instructions from the instruction buffer. The decode mechanism is configured to be
responsive to a predetermined tag in a tag field of an instruction, which predetermined tag
is representative of the instruction being a compound instruction formed from separate
programmed memory instructions. The decode mechanism is operable in response to the
predetermined tag to decode at least a first data flow control for a first programmed
instruction and a second data flow control for a second programmed instruction.

Thus, an embodiment of the invention provides a decode mechanism responsive to
compound instructions formed (e.g., assembled or compiled) by combining separate
programmed instructions. In this manner, it is possible to optimize the use of the
bandwidth available within the processing engine. Appropriate programmed instructions,
such as suitable memory instructions, can thus be assembled, or compiled, to form a
compound instruction. By generating a separate control flow for each of the constituent
programmed instructions from the compound instruction, those instructions can be
performed wholly or partially in parallel with a positive effect on the overall throughput of
the processing engine. The control flow generated by the decode mechanism for each of
the programmed instructions can be the same as that which would have been generated for
the programmed instructions if they had been held as single instructions in the instruction
buffer.

A compact and efficient encoding can be enabled in an embodiment of the
invention. For example, by ensuring that a memory instruction can only be a first of a pair
of instructions in the instruction buffer in the form of a predetermined compound
instruction, parallelism of memory access instructions can be provided with efficient
encoding, efficient use of real estate and reduced power consumption.

In an embodiment of the invention, the compound instruction is defined as a soft
compound memory instruction formed by combining separate programmed memory
instructions (e.g. using an instruction preprocessing mechanism such as a compiler or an
assembler). In a particular example, the compound instruction is a soft dual
memory instruction, that is, a dual memory instruction assembled from separate first and
second programmed memory instructions, although in other examples more than two
instructions can be assembled into a compound instruction.

Preferably, the decode mechanism is operable to decode a first memory address for
a first programmed memory instruction and a second memory address for a
second programmed memory instruction from a compound memory address field in the
compound instruction. Particularly, where the compound address field of the compound
instruction is at the same bit positions as the address field for a hard programmed dual
memory instruction, this can have a positive effect on instruction throughput. In this case
the decoding of the addresses can be started before the operation codes of the instructions
have been decoded, regardless of the format of the first and second instructions of a dual
instruction.

In order to reduce the number of bits required for the compound instruction, the
memory addresses in the compound address field of the compound instruction can be
arranged to be indirect addresses, whereby the decode mechanism needs only to be
operable to decode indirect addresses for such instructions. As dual instructions support
fewer options than single instructions, the size of a post modification field for the addresses
can be reduced, thereby reducing the number of bits required for the addresses themselves
and also dispensing with an indirect/direct indicator bit.

A memory access instruction can be constrained to be a first instruction of a pair of
instructions in the instruction buffer. In this case a soft dual instruction effectively
provides an encoding corresponding to two memory instructions. As a result, the need for
a parallel enable field can be avoided, any memory instruction being implicitly capable of
parallelism. This also provides the further advantages of a reduction in
application code size, optimization of external interface bandwidth and a reduction in
cache misses.

The decoder for the second instruction of an instruction pair can also be made as a
subset of the decoder for the first instruction, resulting in a reduction in the integrated
circuit real estate required and a reduction in power consumption for the processing
engine.

In order to provide a compact instruction format and to enable the address field to
be located at the same position as for a hard compound instruction, the compound
instruction can comprise a split operation code field for a first instruction of the
predetermined compound instruction. The operation code can be split either side of the
address field, for example. The decoder can be responsive to detection of the appropriate
tag field to decode the split operation code for the first instruction of the compound
instruction.

In order to further reduce the number of bits, the compound instruction can
comprise a reduced operation code field for at least the first instruction of the
predetermined compound instruction such that the operation code field comprises fewer
bits than the operation code field of the first programmed instruction. By restricting the
range of operation codes for memory instructions to be within a certain range or ranges,
the number of bits which need to be provided for the first operation code can be reduced.
The decode mechanism can be arranged to be responsive to the predetermined tag to
decode a reduced size operation code for the first instruction of the compound instruction.
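
The saving from restricting the opcode range is simple arithmetic: an opcode field only needs enough bits to distinguish the opcodes that may appear in that slot. The following minimal sketch illustrates this. The function name and the opcode counts are hypothetical and are not taken from the instruction set described here.

```python
def opcode_bits(num_opcodes: int) -> int:
    """Minimum field width, in bits, able to distinguish num_opcodes values."""
    if num_opcodes < 1:
        raise ValueError("need at least one opcode")
    # (n - 1).bit_length() is ceil(log2(n)) for n >= 2; use at least one bit.
    return max(1, (num_opcodes - 1).bit_length())


# Hypothetical counts, for illustration only.
full_width = opcode_bits(256)     # every opcode of an imagined full set
reduced_width = opcode_bits(100)  # only the opcodes allowed in the first slot
bits_saved = full_width - reduced_width
```

With these assumed counts the restricted field is one bit narrower than the full field; the actual saving depends on how many memory opcodes the instruction set defines.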

With the various measures mentioned above, the predetermined compound
instruction can be arranged to have the same number of bits in total as the sum of the bits
of the separate programmed instructions. Reorganization of the fields from the
programmed instructions can lead to the predetermined compound instruction having a
common overall format with other instructions.

Where each programmed instruction has a data address generation (DAGEN) code
field, the individual DAGEN codes of the individual programmed instructions could be
combined into a combined DAGEN code field within the compound instruction. This
could provide more rapid decoding and execution of the compound instruction. The
combined DAGEN code field could form part of a combined address field. Where a
combined DAGEN code field is provided, the decode mechanism can be operable to
respond to a predetermined DAGEN tag to decode the combined DAGEN field.

The processing engine can be provided with a data fetch controller operable to 
fetch, in parallel, first and second operands from addresses identified by the first and 
second memory addresses, respectively. A data write controller can also be operable to 
write in parallel the result of first and second data flow operations for the first and second 
instructions, respectively. Also, dual read/write operations can be provided. 

In an embodiment of the invention, assembler syntax can differentiate between 
hard compound and soft compound syntax to provide visibility for available slots for 
parallelism. A hard compound instruction can be executed in parallel with a non-memory 
instruction such as a control flow or register instruction as indicated by a parallel enable 
bit and as long as there are no bus/operator resource conflicts. 

In accordance with another aspect of the invention, there is provided a processor, 
for example, but not necessarily, a digital signal processor, comprising a processing engine 
as described above. The processor can be implemented as an integrated circuit, for 
example as an Application Specific Integrated Circuit (ASIC). 

A digital signal processing system comprising a processing engine as described 
above can also be provided with an instruction preprocessing mechanism operable to 
combine separate programmed memory instructions to form a compound memory
instruction. The instruction preprocessor can be in the form of a compiler, assembler, etc.,
which is operable to compile or assemble compound instructions from programmed 
instructions. The mechanism can be configured to be operable to determine whether the 
separate programmed memory instructions may be combined prior to assembly of the 
compound instruction. 

In accordance with a further aspect of the invention, there is provided an 
instruction preprocessor for a digital signal processing system, the instruction 
preprocessor being configured to be operable: 

to determine programmed memory instructions capable of being combined; and

to assemble a compound memory instruction from said determined programmed
memory instructions.

In the present context, the term "instruction preprocessor" is to be understood
broadly to cover any mechanism for preprocessing instructions, that is, compiling
and/or assembling instructions, including compilers, assemblers, etc.

The instruction preprocessor may be provided separately, for example on a carrier
medium such as a data storage medium (e.g. a disc or solid state memory) or a data
transmission medium (e.g. an electrical, optical or other electromagnetic, such as
wireless, transmission medium).

In accordance with another aspect of the invention, there is provided a method of 
improving the performance of a processing engine. The method includes: 

buffering a compound instruction assembled from separate programmed memory
instructions, the compound instruction including a tag field containing a predetermined
compound instruction tag; and

responding to the predetermined compound instruction tag in the tag field of an 
instruction in the instruction buffer to decode, from the compound instruction, at least first 
data flow control for a first programmed instruction and second data flow control for a 
second programmed instruction.

Brief Description of the Drawings

Particular embodiments in accordance with the invention will now be described, by
way of example only, and with reference to the accompanying drawings in which like
reference signs are used to denote like parts, unless otherwise stated, and in which:

Figure 1 is a schematic block diagram of a processor in accordance with an 
embodiment of the invention; 

Figure 2 is a schematic diagram of a core of the processor of Figure 1; 

Figure 3 is a more detailed schematic block diagram of various execution units of
the core of the processor of Figure 1;

Figure 4 is a schematic diagram of an instruction buffer queue and an instruction
decoder controller of the processor of Figure 1;

Figure 5 is a representation of pipeline phases of the processor of Figure 1;

Figure 6 is a diagrammatic illustration of an example of the operation of a pipeline
in the processor of Figure 1;

Figure 7 is a schematic representation of the core of the processor for explaining
the operation of the pipeline of the processor of Figure 1;

Figure 8 illustrates examples of instruction pairs;

Figure 9 illustrates the relative timing of bus cycles for various instructions;

Figure 10 illustrates an example of the execution of a soft dual instruction;

Figure 11 is a schematic diagram illustrating the generation of a soft dual
instruction;

Figure 12 is a flow diagram of the generation of a soft dual instruction;

Figure 13 is a block diagram of a structure for executing a soft dual instruction;

Figure 14 illustrates memory bus interfacing for a soft dual instruction operation;

Figure 15 is a table illustrating operand fetch control for a soft dual instruction;

Figure 16 is a schematic representation of an integrated circuit incorporating the
processor of Figure 1; and

Figure 17 is a schematic representation of a telecommunications device
incorporating the processor of Figure 1.

Description of Particular Embodiments

Although the invention finds particular application to Digital Signal Processors
(DSPs), implemented for example in an Application Specific Integrated Circuit (ASIC), it
also finds application to other forms of processing engines.

Figure 1 is a block diagram of a microprocessor 10 which has an embodiment of
the present invention. Microprocessor 10 is a digital signal processor ("DSP"). In the
interest of clarity, Figure 1 only shows those portions of microprocessor 10 that are
relevant to an understanding of an embodiment of the present invention. Details of
general construction for DSPs are well known, and may be found readily elsewhere. For
example, U.S. Patent 5,072,418 issued to Frederick Boutaud, et al., describes a DSP in
detail and is incorporated herein by reference. U.S. Patent 5,329,471 issued to Gary
Swoboda, et al., describes in detail how to test and emulate a DSP and is incorporated
herein by reference. Details of portions of microprocessor 10 relevant to an embodiment
of the present invention are explained in sufficient detail hereinbelow, so as to enable one
of ordinary skill in the microprocessor art to make and use the invention.

Several example systems which can benefit from aspects of the present invention
are described in U.S. Patent 5,072,418, which was incorporated by reference herein,
particularly with reference to Figures 2-18 of U.S. Patent 5,072,418. A microprocessor
incorporating an aspect of the present invention to improve performance or reduce cost
can be used to further improve the systems described in U.S. Patent 5,072,418. Such
systems include, but are not limited to, industrial process controls, automotive vehicle
systems, motor controls, robotic control systems, satellite telecommunication systems,
echo canceling systems, modems, video imaging systems, speech recognition systems,
vocoder-modem systems with encryption, and such.

A description of various architectural features and a description of a complete set
of instructions of the microprocessor of Figure 1 are provided in co-assigned application
Serial No. [illegible] (TI-28433), which is incorporated herein by reference.

The basic architecture of an example of a processor according to the invention will
now be described.

Figure 1 is a schematic overview of a processor 10 forming an exemplary
embodiment of the present invention. The processor 10 includes a processing engine 100
and a processor backplane 20. In the present embodiment, the processor is a Digital
Signal Processor 10 implemented in an Application Specific Integrated Circuit (ASIC).

As shown in Figure 1, the processing engine 100 forms a central processing unit
(CPU) with a processing core 102 and a memory interface, or management, unit 104 for
interfacing the processing core 102 with memory units external to the processor core 102.

The processor backplane 20 comprises a backplane bus 22, to which the memory
management unit 104 of the processing engine is connected. Also connected to the
backplane bus 22 are an instruction cache memory 24, peripheral devices 26 and an external
interface 28.

It will be appreciated that in other embodiments, the invention could be
implemented using different configurations and/or different technologies. For example,
the processing engine 100 could form the processor 10, with the processor backplane 20
being separate therefrom. The processing engine 100 could, for example, be a DSP
separate from and mounted on a backplane 20 supporting a backplane bus 22, peripheral
and external interfaces. The processing engine 100 could, for example, be a
microprocessor rather than a DSP and could be implemented in technologies other than
ASIC technology. The processing engine, or a processor including the processing engine,
could be implemented in one or more integrated circuits.

Figure 2 illustrates the basic structure of an embodiment of the processing core
102. As illustrated, the processing core 102 includes four elements, namely an Instruction
Buffer Unit (I Unit) 106 and three execution units. The execution units are a Program
Flow Unit (P Unit) 108, Address Data Flow Unit (A Unit) 110 and a Data Computation
Unit (D Unit) 112 for executing instructions decoded from the Instruction Buffer Unit (I
Unit) 106 and for controlling and monitoring program flow.

Figure 3 illustrates the P Unit 108, A Unit 110 and D Unit 112 of the processing
core 102 in more detail and shows the bus structure connecting the various elements of
the processing core 102. The P Unit 108 includes, for example, loop control circuitry,
GoTo/Branch control circuitry and various registers for controlling and monitoring
program flow such as repeat counter registers and interrupt mask, flag or vector registers.
The P Unit 108 is coupled to general purpose Data Write busses (EB, FB) 130, 132, Data
Read busses (CB, DB) 134, 136 and an address constant bus (KAB) 142. Additionally,
the P Unit 108 is coupled to sub-units within the A Unit 110 and D Unit 112 via various
busses labeled CSR, ACB and RGD.

As illustrated in Figure 3, in the present embodiment the A Unit 110 includes a 
register file 30, a data address generation sub-unit (DAGEN) 32 and an Arithmetic and 
Logic Unit (ALU) 34. The A Unit register file 30 includes various registers, among which 
are 16 bit pointer registers (AR0-AR7) and data registers (DR0-DR3) which may also be 
used for data flow as well as address generation. Additionally, the register file includes 16 
bit circular buffer registers and 7 bit data page registers. As well as the general purpose
busses (EB, FB, CB, DB) 130, 132, 134, 136, a data constant bus 140 and address 
constant bus 142 are coupled to the A Unit register file 30. The A Unit register file 30 is 
coupled to the A Unit DAGEN unit 32 by unidirectional busses 144 and 146 respectively 
operating in opposite directions. The DAGEN unit 32 includes 16 bit X/Y registers and 
coefficient and stack pointer registers, for example for controlling and monitoring address 
generation within the processing engine 100. 

The A Unit 110 also comprises the ALU 34 which includes a shifter function as 
well as the functions typically associated with an ALU such as addition, subtraction, and 
AND, OR and XOR logical operators. The ALU 34 is also coupled to the general- 
purpose busses (EB, DB) 130, 136 and an instruction constant data bus (KDB) 140. The 
A Unit ALU is coupled to the P Unit 108 by a PDA bus for receiving register content 
from the P Unit 108 register file. The ALU 34 is also coupled to the A Unit register file 
30 by busses RGA and RGB for receiving address and data register contents and by a bus 
RGD for forwarding address and data registers in the register file 30. 

As illustrated, the D Unit 112 includes a D Unit register file 36, a D Unit ALU 38,
a D Unit shifter 40 and two multiply and accumulate units (MAC1, MAC2) 42 and 44.
The D Unit register file 36, D Unit ALU 38 and D Unit shifter 40 are coupled to busses
(EB, FB, CB, DB and KDB) 130, 132, 134, 136 and 140, and the MAC units 42 and 44
are coupled to the busses (CB, DB, KDB) 134, 136, 140 and data read bus (BB) 144.

The D Unit register file 36 includes 40-bit accumulators (AC0-AC3) and a 16-bit
transition register. The D Unit 112 can also utilize the 16 bit pointer and data registers in
the A Unit 110 as source or destination registers in addition to the 40-bit accumulators.
The D Unit register file 36 receives data from the D Unit ALU 38 and MACs 1&2 42, 44
over accumulator write busses (ACW0, ACW1) 146, 148, and from the D Unit shifter 40
over accumulator write bus (ACW1) 148. Data is read from the D Unit register file
accumulators to the D Unit ALU 38, D Unit shifter 40 and MACs 1&2 42, 44 over
accumulator read busses (ACR0, ACR1) 150, 152. The D Unit ALU 38 and D Unit
shifter 40 are also coupled to sub-units of the A Unit 110 via various busses labeled EFC,
DRB, DR2 and ACB.

Referring now to Figure 4, there is illustrated an instruction buffer unit 106
comprising a 32 word instruction buffer queue (IBQ) 502. The IBQ 502 comprises 32x16
bit registers 504, logically divided into 8 bit bytes 506. Instructions arrive at the IBQ 502
via the 32-bit program bus (PB) 122. The instructions are fetched in a 32-bit cycle into
the location pointed to by the Local Write Program Counter (LWPC) 532. The LWPC
532 is contained in a register located in the P Unit 108. The P Unit 108 also includes the
Local Read Program Counter (LRPC) 536 register, and the Write Program Counter
(WPC) 530 and Read Program Counter (RPC) 534 registers. LRPC 536 points to the
location in the IBQ 502 of the next instruction or instructions to be loaded into the
instruction decoder(s) 512 and 514. That is to say, the LRPC 536 points to the location in
the IBQ 502 of the instruction currently being dispatched to the decoders 512, 514. The
WPC points to the address in program memory of the start of the next 4 bytes of
instruction code for the pipeline. For each fetch into the IBQ, the next 4 bytes from the
program memory are fetched regardless of instruction boundaries. The RPC 534 points to
the address in program memory of the instruction currently being dispatched to the
decoder(s) 512 and 514.
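
The relationship between the four counters can be illustrated with a small software model. This is a sketch only, under several assumptions that the text does not state: the pointers are treated as byte-granular, the queue wraps around as a ring buffer, and a fetch is refused when fewer than 4 bytes are free. The class and method names are hypothetical.

```python
IBQ_BYTES = 32 * 2   # 32 words of 16 bits, modelled here as 64 bytes
FETCH_BYTES = 4      # one 32-bit program bus transfer


class InstructionBufferQueue:
    """Toy model of the IBQ with global (WPC, RPC) and local (LWPC, LRPC) pointers."""

    def __init__(self, start_address=0):
        self.buf = bytearray(IBQ_BYTES)
        self.wpc = start_address  # program memory address of the next fetch
        self.rpc = start_address  # program memory address of the next dispatch
        self.lwpc = 0             # write position inside the queue
        self.lrpc = 0             # read position inside the queue

    def fill_level(self):
        """Number of fetched bytes not yet dispatched."""
        return self.wpc - self.rpc

    def fetch(self, program_memory):
        """Copy the next 4 bytes of program memory, ignoring instruction boundaries."""
        if self.fill_level() + FETCH_BYTES > IBQ_BYTES:
            return False  # assumed behaviour: the fetch waits when the queue is full
        for i in range(FETCH_BYTES):
            self.buf[(self.lwpc + i) % IBQ_BYTES] = program_memory[self.wpc + i]
        self.lwpc = (self.lwpc + FETCH_BYTES) % IBQ_BYTES
        self.wpc += FETCH_BYTES
        return True

    def dispatch(self, length):
        """Hand `length` bytes (one instruction) to the decoder side."""
        if length > self.fill_level():
            raise ValueError("instruction not fully fetched yet")
        data = bytes(self.buf[(self.lrpc + i) % IBQ_BYTES] for i in range(length))
        self.lrpc = (self.lrpc + length) % IBQ_BYTES
        self.rpc += length
        return data
```

For example, after two fetches and the dispatch of a 3-byte and a 2-byte instruction, WPC has advanced by 8 bytes and RPC by 5, which shows that fetching proceeds in fixed 4-byte steps while dispatch follows instruction lengths.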

The instructions are formed into a 48-bit word and are loaded into the instruction
decoders 512, 514 over a 48-bit bus 516 via multiplexors 520 and 521. It will be apparent
to a person of ordinary skill in the art that the instructions may be formed into words
comprising other than 48 bits, and that the present invention is not limited to the specific
embodiment described above.

The bus 516 can load a maximum of two instructions, one per decoder, during any
one instruction cycle. The instructions may be in any combination of
formats, 8, 16, 24, 32, 40 and 48 bits, which will fit across the 48-bit bus. Decoder 1, 512,
is loaded in preference to decoder 2, 514, if only one instruction can be loaded during a
cycle. The respective instructions are then forwarded on to the respective function units in
order to execute them and to access the data for which the instruction or operation is to be
performed. Prior to being passed to the instruction decoders, the instructions are aligned
on byte boundaries. The alignment is done based on the format derived for the previous
instruction during decoding thereof. The multiplexing associated with the alignment of
instructions with byte boundaries is performed in multiplexors 520 and 521.
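
The constraint that two instructions must fit across the 48-bit bus can be written down directly. The sketch below considers instruction size only and ignores every other condition on pairing (resource conflicts, parallel enable bits and so on); the function names are hypothetical.

```python
FORMATS_BITS = (8, 16, 24, 32, 40, 48)  # instruction formats named in the text
BUS_BITS = 48                           # width of bus 516


def instructions_loaded(first_bits, second_bits):
    """Return how many of the next two instructions fit on the bus in one cycle.

    Only the size constraint is modelled. The first instruction always goes to
    decoder 1; the second goes to decoder 2 only if both fit together.
    """
    for bits in (first_bits, second_bits):
        if bits not in FORMATS_BITS:
            raise ValueError(f"unsupported instruction format: {bits} bits")
    return 2 if first_bits + second_bits <= BUS_BITS else 1


def valid_pairs():
    """All ordered (first, second) format pairs that fit across the bus together."""
    return [(a, b) for a in FORMATS_BITS for b in FORMATS_BITS
            if a + b <= BUS_BITS]
```

A 16-bit instruction followed by a 24-bit instruction can be loaded together, whereas a 32-bit instruction followed by a 24-bit instruction cannot, so only the first is loaded in that cycle.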

The processor core 102 executes instructions through a 7 stage pipeline, the 
respective stages of which will now be described with reference to Figure 5. 

The first stage of the pipeline is a PRE-FETCH (P0) stage 202, during which stage
a next program memory location is addressed by asserting an address on the address bus
(PAB) 118 of a memory interface, or memory management unit 104.

In the next stage, FETCH (P1) stage 204, the program memory is read and the I
Unit 106 is filled via the PB bus 122 from the memory management unit 104.

The PRE-FETCH and FETCH stages are separate from the rest of the pipeline
stages in that the pipeline can be interrupted during the PRE-FETCH and FETCH stages
to break the sequential program flow and point to other instructions in the program
memory, for example for a Branch instruction.

The next instruction in the instruction buffer is then dispatched to the decoder(s)
512/514 in the third stage, DECODE (P2) 206, where the instruction is decoded and
dispatched to the execution unit for executing that instruction, for example to the P Unit
108, the A Unit 110 or the D Unit 112. The decode stage 206 includes decoding at least
part of an instruction including a first part indicating the class of the instruction, a second
part indicating the format of the instruction and a third part indicating an addressing mode
for the instruction.

The next stage is an ADDRESS (P3) stage 208, in which the address of the data to
be used in the instruction is computed, or a new program address is computed should the
instruction require a program branch or jump. Respective computations take place in the
A Unit 110 or the P Unit 108 respectively.

In an ACCESS (P4) stage 210 the address of a read operand is output and the
memory operand, the address of which has been generated in a DAGEN X operator with
an Xmem indirect addressing mode, is then READ from indirectly addressed X memory
(Xmem).

The next stage of the pipeline is the READ (P5) stage 212 in which a memory
operand, the address of which has been generated in a DAGEN Y operator with a Ymem
indirect addressing mode or in a DAGEN C operator with coefficient address mode, is
READ. The address of the memory location to which the result of the instruction is to be
written is output.

In the case of dual access, read operands can also be generated in the Y path, and
write operands in the X path.

Finally, there is an execution EXEC (P6) stage 214 in which the instruction is
executed in either the A Unit 110 or the D Unit 112. The result is then stored in a data
register or accumulator, or written to memory for Read/Modify/Write or store instructions.
Additionally, shift operations are performed on data in accumulators during the EXEC
stage.

The basic principle of operation for a pipeline processor will now be described
with reference to Figure 6. As can be seen from Figure 6, for a first instruction 302, the
successive pipeline stages take place over time periods T1-T7. Each time period is a clock
cycle for the processor machine clock. A second instruction 304 can enter the pipeline in
period T2, since the previous instruction has now moved on to the next pipeline stage. For
instruction 3, 306, the PRE-FETCH stage 202 occurs in time period T3. As can be seen
from Figure 6, for a seven stage pipeline a total of 7 instructions may be processed
simultaneously. For all 7 instructions 302-314, Figure 6 shows them all under process in
time period T7. Such a structure adds a form of parallelism to the processing of
instructions.
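
The overlap described above can be reproduced with a few lines of code. The sketch assumes an idealized pipeline in which one instruction enters per clock cycle and nothing stalls; instructions and time periods are both numbered from 1, and the function names are hypothetical.

```python
STAGES = ("PRE-FETCH", "FETCH", "DECODE", "ADDRESS", "ACCESS", "READ", "EXEC")


def stage_at(instruction, period):
    """Stage occupied by `instruction` during time `period`, or None if not in flight.

    Instruction n is assumed to enter PRE-FETCH in period Tn and to advance by
    one stage per period with no stalls.
    """
    index = period - instruction
    if 0 <= index < len(STAGES):
        return STAGES[index]
    return None


def in_flight(period, issued):
    """Instructions (out of the first `issued`) that occupy a stage in `period`."""
    return [n for n in range(1, issued + 1) if stage_at(n, period) is not None]
```

Under these assumptions, in period T7 the first instruction is in EXEC while the seventh is in PRE-FETCH, so all seven stages are occupied at once.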

As shown in Figure 7, the present embodiment of the invention includes a memory
management unit 104 which is coupled to external memory units (not shown) via a 24 bit
address bus 114 and a bi-directional 16 bit data bus 116. Additionally, the memory
management unit 104 is coupled to program storage memory (not shown) via a 24 bit
address bus 118 and a 32 bit bi-directional data bus 120. The memory management unit
104 is also coupled to the I Unit 106 of the machine processor core 102 via a 32 bit
program read bus (PB) 122. The P Unit 108, A Unit 110 and D Unit 112 are coupled to
the memory management unit 104 via data read and data write busses and corresponding
address busses. The P Unit 108 is further coupled to a program address bus 128.

More particularly, the P Unit 108 is coupled to the memory management unit 104
by a 24 bit program address bus 128, the two 16 bit data write busses (EB, FB) 130, 132,
and the two 16 bit data read busses (CB, DB) 134, 136. The A Unit 110 is coupled to the
memory management unit 104 via two 24 bit data write address busses (EAB, FAB) 160,
162, the two 16 bit data write busses (EB, FB) 130, 132, the three data read address
busses (BAB, CAB, DAB) 164, 166, 168 and the two 16 bit data read busses (CB, DB)
134, 136. The D Unit 112 is coupled to the memory management unit 104 via the two
data write busses (EB, FB) 130, 132 and three data read busses (BB, CB, DB) 144, 134,
136.

Figure 7 represents the passing of instructions from the I Unit 106 to the P Unit
108 at 124, for forwarding branch instructions for example. Additionally, Figure 7
represents the passing of data from the I Unit 106 to the A Unit 110 and the D Unit 112 at
126 and 128 respectively.

In a particular embodiment of the invention, the processing engine 100 is
responsive to machine instructions in a number of formats. Examples of such instructions
in different formats are illustrated in the following.

8 Bit Instruction : OOOO OOOO

This represents an eight bit instruction, for example a memory map qualifier
(MMAP()) or a read port qualifier (readport()). Such a qualifier comprises merely an
eight bit opcode (OOOO OOOO). In such a case parallelism is implicit.

16 Bit Instruction : OOOO OOOE FSSS FDDD

This represents an example of a sixteen bit instruction, for example an instruction
where the content of a destination register (e.g., dst) becomes the sum of the prior content
of that register (dst) and the content of a source register (src), that is:

dst = dst + src

Such an instruction comprises a seven bit opcode (OOOO OOO) with a one bit
parallel enable field (E), a four bit source register identifier (FSSS) and a four bit
destination register identifier (FDDD).
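
By way of illustration only, the field layout OOOO OOOE FSSS FDDD can be unpacked with shifts and masks. The following Python sketch assumes the leftmost pattern character is the most significant bit; the function name `decode_add16` and the dictionary keys are hypothetical and do not appear in the specification.

```python
def decode_add16(word):
    """Split a 16 bit word laid out as OOOO OOOE FSSS FDDD (MSB first).

    Assumption: the leftmost pattern character is bit 15 of the word.
    """
    if not 0 <= word <= 0xFFFF:
        raise ValueError("expected a 16 bit value")
    return {
        "opcode": (word >> 9) & 0x7F,  # seven opcode bits (OOOO OOO)
        "E": (word >> 8) & 0x1,        # one bit parallel enable field
        "src": (word >> 4) & 0xF,      # four bit source register identifier (FSSS)
        "dst": word & 0xF,             # four bit destination register identifier (FDDD)
    }


# Example word written field by field: opcode, E, FSSS, FDDD.
fields = decode_add16(0b0010101_1_0011_0101)
```

The same shift-and-mask approach would apply to the other fixed formats, with the shift amounts taken from the corresponding bit pattern.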

16 Bit Instruction : OOOO FDDD PPPM MMMI

This represents another example of a sixteen bit instruction, for example where the
content of a destination register (e.g., dst) becomes the content of a memory location
(Smem), that is:

dst = Smem

Such an instruction comprises a four bit opcode (OOOO), a four bit destination
register identifier (FDDD), a three bit pointer address (PPP), a four bit address modifier
(MMMM) and a direct/indirect address indicator (I).

24 Bit Instruction : OOOO OOOE LLLL LLLL oCCC CCCC

This represents an example of a twenty four bit instruction, for example a
conditional instruction for a branch to an offset (L8) where a condition is met, that is:

if(cond) goto L8

Such an instruction comprises a seven bit opcode (OOOO OOO) with a one bit
parallel enable field (E), an eight bit branch offset (LLLL LLLL), a one bit opcode
extension (o) and a seven bit condition field (CCC CCCC).

24 Bit Instruction : OOOO OOOO PPPM MMMI SSDD ooU%

This is another example of a twenty-four bit instruction, for example a single
memory operand instruction where the content of an accumulator (ACy) becomes the
result of rounding the sum of the content of another accumulator (ACx) and the square of
the content of a memory location (with optional rounding), and optionally the content of a
data register (DR3) can become the content of the memory location, that is:

ACy = rnd(ACx + Smem * Smem), DR3 = Smem

Such an instruction comprises an eight bit opcode (OOOO OOOO), a three bit
pointer address (PPP), a four bit address modifier (MMMM), a one bit direct/indirect
address indicator field (I), a two bit source accumulator identifier (SS), a two bit
destination accumulator identifier (DD), a two bit opcode extension (oo), an update
condition field (U), and a one bit rounding option field (%).

32 Bit Instruction : OOOO OOOO PPPM MMMI KKKK KKKK KKKK KKKK

This is an example of a thirty-two bit instruction, for example an instruction where
the content of a test register (TC1) is set to 1 or 0 depending on the sign comparison of a
memory location (Smem) to a constant value (K16), that is:

TC1 = (Smem == K16)

Such an instruction comprises an eight bit opcode (OOOO OOOO), a three bit
pointer address (PPP), a four bit address modifier (MMMM), a one bit direct/indirect
address indicator field (I) and a sixteen bit constant field (KKKK KKKK KKKK KKKK).

Hard Dual Instruction: OOOO OOOO XXXM MMYY YMMM SSDD ooox ssU% 

This is an example of a 32 bit dual access instruction, which could be termed a
"hard dual access instruction", or a hard programmed dual memory instruction, that is a
dual instruction which has been programmed as such, for example, by a programmer.
Such an instruction requires two DAGEN operators. A second instruction can be
executed in parallel. This is typically a register or control instruction. Memory stack
instructions can also be executed in parallel as long as there are no bus conflicts. An
example of such an instruction is:

ACy = rnd(DRx * Xmem),
Ymem = HI(ACx << DR2)
DR3 = Xmem

This instruction comprises an eight bit opcode (OOOO OOOO), a three bit Xmem
pointer address (XXX) with a four bit address modifier (MMMM), a three bit Ymem
pointer address (YYY) with a four bit address modifier (MMMM), a two bit source
accumulator (ACx) identifier (SS), a two bit destination accumulator (ACy) identifier (DD),
a three bit opcode extension (ooo), a don't care bit (x), a two bit source accumulator
identifier (ss), a one bit optional DR3 update field (U) and a one bit optional rounding
field (%).
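
As an illustrative sketch only, the fields of such an instruction can be recovered directly from the printed bit pattern by grouping runs of identical pattern letters. In the following Python example the field widths are taken from the pattern string as printed, not from the prose description; the function name `extract_fields` and the constant name are hypothetical, and the leftmost pattern character is assumed to be the most significant bit.

```python
from itertools import groupby

# Bit pattern of the hard dual instruction, copied from the format line.
HARD_DUAL_PATTERN = "OOOO OOOO XXXM MMYY YMMM SSDD ooox ssU%"


def extract_fields(pattern, word):
    """Return (letter, width, value) for each run of identical pattern letters.

    Assumption: the leftmost pattern character maps to the most
    significant bit of `word`.
    """
    letters = pattern.replace(" ", "")
    total_bits = len(letters)
    if not 0 <= word < (1 << total_bits):
        raise ValueError(f"expected a {total_bits} bit value")
    fields = []
    remaining = total_bits
    for letter, run in groupby(letters):
        width = len(list(run))
        remaining -= width  # bits still to the right of this field
        fields.append((letter, width, (word >> remaining) & ((1 << width) - 1)))
    return fields


fields = extract_fields(HARD_DUAL_PATTERN, 0xA5000001)
```

Because upper and lower case letters are treated as distinct, the source accumulator field (SS) and the second source accumulator field (ss) are reported separately, as are the two modifier runs.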

Figure 8 is a table illustrating combinations of instructions forming instruction pairs
and also a soft dual instruction. In such instruction pairs, the first instruction of the pair is
always a memory operation. It will be noted that where the second instruction is also a
memory instruction, then this is configured as a soft dual instruction, that is a compound
instruction.

Instructions which may be located in a second position of an instruction pair (i.e.
for the higher program address of the pair) include a parallel enable field (E bit) to indicate
whether the instruction can be performed in parallel with the first of a pair of instructions.
The parallel enable bit is located at a predetermined offset from the instruction format
boundary between the instructions. The decoder is arranged to be responsive to the 'E'
bit in order to control instruction execution.
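
A minimal sketch of reading the parallel enable bit of a second instruction is given below in Python. The offset of seven bits from the start of the instruction is an assumption taken from the OOOO OOOE example formats above; the specification states only that the offset is predetermined. The names `E_BIT_OFFSET` and `parallel_enable` are hypothetical.

```python
# Assumption: the E bit is the eighth bit from the left of the second
# instruction, as in the OOOO OOOE example formats.
E_BIT_OFFSET = 7


def parallel_enable(second_instr, length_bits):
    """Return True if the E bit of the second instruction of a pair is set.

    `second_instr` holds the instruction right-aligned in `length_bits` bits.
    """
    if length_bits <= E_BIT_OFFSET:
        raise ValueError("instruction too short to hold an E bit")
    shift = length_bits - 1 - E_BIT_OFFSET
    return bool((second_instr >> shift) & 1)


can_pair = parallel_enable(0x0100, 16)
```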

The reason for having a memory operation first in an instruction pair is that at the
entry to the address decode stage of the processor pipeline, the decoder does not know
the format of the instruction, or even where the format boundary is located. Memory
address decoding is one of the critical stages of the pipeline to ensure good instruction
throughput. Accordingly, it is necessary to know reliably the location and size
of the address bits for a memory instruction to be decoded in order that the decoding can
commence even before the exact nature of the instruction is determined.

A further advantage which results from constraining a memory instruction to be
located as the first instruction in an instruction pair is that it is then not necessary for a
memory instruction to include a field indicating whether parallel operation is permitted.
This makes the instruction set more efficient and allows improved code size.

Yet a further advantage is that the hardware necessary for decoding a second
instruction of an instruction pair need only be a subset of the hardware for decoding the
first instruction of the instruction pair. The first instruction is the instruction of the
instruction pair with a lower program address than the second instruction of the
instruction pair. Thus, the decode hardware for the instruction with a higher program
address of an instruction pair can be a subset of the decode hardware for the instruction
with a lower program address of an instruction pair. This enables a reduction in the silicon
area and power consumption required for implementing and operating the decode
hardware.

Where two instructions of an instruction pair can be performed in parallel, this
takes place in respective decoding and execution stages. However, due to physical bus
timing constraints, bus transfers can be staggered.

Figure 9 illustrates the pipeline stage in which memory access takes place for
different types of instructions, including dual instructions. It should be noted, as for
Figure 4, that the pipeline stages shown are for illustrative purposes only. In practice, the
prefetch and fetch stages form a flow separate from that of the remaining stages.

Comparing Figure 9 with Figure 5, P1 represents the fetch stage, P2 the decode
stage, P3 the address computation stage, P4 the access stage, P5 the read stage and P6 the
execute stage. B represents a coefficient read access from a register via the B bus. C and
D represent memory read accesses via the C and D busses respectively. E and F represent
write accesses via the E and F busses respectively. In order that the read and write
accesses can be performed at the required cycles without causing a bubble (or stall) on the
pipeline, decoding is performed as early as possible.

Figure 10 illustrates a particular form of dual memory access instruction. It is
effectively formed from two merged programmed instructions which have implied
parallelism. The dual memory instruction of Figure 10 is termed a soft dual instruction, or
also a compound instruction herein. It is formed by combining two programmed single
memory access instructions in an instruction preprocessor, for example in a compiler or an
assembler. In other words, this compound instruction is not programmed, or
pre-programmed, as a dual instruction by a programmer. The provision of this form of
compound instruction enables improved memory access performance by permitting
parallel operation, with both instructions being executed in the same cycle. In a particular
example described in the following, the soft dual instruction is restricted to indirect
addressing with dual modifier options. As a result, it is possible to encode the soft dual
instruction to achieve increased performance through parallel operation with no size
penalty in respect of the combined instruction size.

The soft dual instruction is qualified by a five bit tag field 701, with individual
following instruction fields organized as illustrated in Figure 10. The size of the tag field
results from constraints relating to the particular implementation, namely:
- that the total encoding format is constrained not to be greater than the sum of the
encoding formats of the two constituent programmed instructions;
- that the total instruction format size is a multiple of 8; and
- the availability of opcodes with respect to other single instructions.

Following the tag field 701 are:
- part 702 of the operation code field for a first instruction;
- a compound address field 703/704 including an indirect memory address
(XXXMMM) 703 for the first instruction and an indirect memory address (YYYMMM)
704 for a second instruction;
- the remainder of the operation code field 705 for the first instruction;
- a data flow field 706 for the first instruction;
- an operation code field 707 for the operation code of the second instruction; and
- a data flow field 708 for the second instruction.
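
The size constraints on the tag field can be checked with simple arithmetic. The following Python sketch tabulates the fields listed above; the widths of fields 702 and 705 to 708 are taken from the description of Figure 11 elsewhere in this specification (three and four opcode bits, and one byte each for the data flow and second operation code fields), and the variable names are hypothetical.

```python
# (field, width in bits), in the order the fields follow the tag.
SOFT_DUAL_FIELDS = [
    ("tag 701", 5),
    ("operation code #1, first part 702", 3),
    ("indirect address XXXMMM 703", 6),
    ("indirect address YYYMMM 704", 6),
    ("operation code #1, remainder 705", 4),
    ("data flow #1 706", 8),
    ("operation code #2 707", 8),
    ("data flow #2 708", 8),
]

total_bits = sum(width for _, width in SOFT_DUAL_FIELDS)

# Two constituent 24 bit single memory instructions.
constituent_bits = 24 + 24

no_size_penalty = total_bits <= constituent_bits
byte_multiple = total_bits % 8 == 0
```

Under these assumed widths the fields add up to 48 bits, which equals the combined size of the two 24 bit constituent instructions and is a multiple of 8.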

It can be seen, therefore, that the combined address portion for the soft dual
instruction is held at the same location in the soft dual instruction as for any other dual
instruction. This provides the advantage of rapid address decoding as a result of being
able to commence address decoding without knowledge of the instruction type involved.
It will be seen that in order to achieve this, some reorganization of the bits in the soft dual
instruction is necessary, for example as described above.

In addition to the modifications described above, where two programmed
instructions each comprise a data address generation (DAGEN) field, these could be
combined to form a combined DAGEN field in the soft dual instruction. The provision of
a combined DAGEN field can facilitate and speed subsequent execution of the soft dual
instruction.

Figure 11 illustrates various steps in transforming two independent instructions
into a soft dual instruction.

Two independent instructions 721 and 722 are represented at stage 720.

As shown at 723, a first 24 bit instruction 721 includes an eight bit operation code
724 in the first byte, a single memory (Smem) address 725 in the next byte and data flow
bits 726 in the next byte. A second 24 bit instruction 722 includes an eight bit operation
code 727 in the first byte, a single memory address 728 in the next byte and data flow bits
729 in the next byte. At 730, the eight operation code bits are each labeled 'O' in the
operation code bytes 724 and 727 of each of the instructions. The single memory
addresses 725 and 728 are each shown to comprise 7 address bits 'A' plus an
indirect/direct indicator bit 'I'. This is because addresses for the standard memory
accesses can be either direct or indirect. In the example shown, the granularity is based on
bytes. However, in other examples a granularity based on other than 8 bits may be
employed.

At stage 735, the operation code 724 of the first instruction is split into two parts.
Only seven of the eight bits of the operation code 724 need to be considered. This is a
result of memory code mapping, which can ensure that the remaining bit is redundant in the
case of a soft dual instruction (e.g., by ensuring that all memory instructions have operation
codes within a determined range, for example, 80-FF in hexadecimal notation, for a soft dual
instruction). As can be seen later in stages 736 and 740, and also in Figure 10, the
operation code for the first instruction is split. Three bits of the operation code for the
first instruction are placed between a soft dual instruction tag 737 and the combined
addresses 738 for the first and second instructions and four bits are placed after the
combined addresses 738.

At stage 736, the insertion of a soft dual instruction tag 737 is shown. This is a
tag which can be interpreted by the decoder as representing a soft dual instruction. Also
shown is the merging of the single memory fields 725 and 728. This can be achieved
because all soft dual instructions are restricted to indirect addresses, whereby an
indirect/direct flag is not needed. The indirect addresses are indicated by a three bit base
address XXX or YYY, for the first and second instructions, respectively, and a three bit
modifier (MMM). Stage 736 further illustrates the moving of the data flow for the first
instruction to the first byte position of the second instruction, with the operation code for
the second instruction being moved to the second byte position of that instruction.

As a result, the format of the soft dual instruction represented in Figure 10 is
achieved. It is to be noted that there is no code size penalty for a soft dual instruction
versus two single memory access instructions. By replacing two single memory (Smem)
instructions by an Xmem, Ymem, enough bits are freed up to insert the 'soft dual' tag
701/737. The soft dual tag by itself allows the decoder to detect that it should decode the
pair of instructions as memory instructions. Instruction set mapping can be used to ensure
that memory instructions are encoded within a window 80-FF, whereby the most
significant bit (bit 7) of the first operation code 724 can be discarded when effecting the
dual field encoding.
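
The transformation of Figure 11 can be sketched, under stated assumptions, as the following Python example. The tag value `SOFT_DUAL_TAG` is hypothetical, since no tag value is given in this description. It is also assumed that the Smem byte is laid out as PPPM MMMI, that an I bit of 1 denotes indirect addressing, and that only modifiers which already fit in three bits are eligible; the actual modifier mapping is not specified here.

```python
SOFT_DUAL_TAG = 0b11111  # hypothetical five bit tag value


def split24(instr):
    """Return (operation code, Smem byte, data flow byte) of a 24 bit instruction."""
    return (instr >> 16) & 0xFF, (instr >> 8) & 0xFF, instr & 0xFF


def indirect_field(smem_byte):
    """Reduce an Smem byte (assumed PPPM MMMI) to a six bit base/modifier field."""
    pointer = smem_byte >> 5
    modifier = (smem_byte >> 1) & 0xF
    if not smem_byte & 1:  # assumption: I = 1 means indirect
        raise ValueError("soft dual instructions require indirect addressing")
    if modifier > 0b111:   # assumption: only 3 bit modifiers are eligible
        raise ValueError("modifier does not fit the three bit MMM field")
    return (pointer << 3) | modifier


def make_soft_dual(first, second):
    """Merge two 24 bit single memory instructions into one 48 bit word."""
    op1, smem1, flow1 = split24(first)
    op2, smem2, flow2 = split24(second)
    if op1 < 0x80 or op2 < 0x80:
        raise ValueError("memory operation codes are expected in the window 80-FF")
    op1_low7 = op1 & 0x7F  # bit 7 is implied by the 80-FF window and is dropped
    fields = [
        (SOFT_DUAL_TAG, 5),
        (op1_low7 >> 4, 3),           # three opcode bits before the addresses
        (indirect_field(smem1), 6),   # XXXMMM
        (indirect_field(smem2), 6),   # YYYMMM
        (op1_low7 & 0xF, 4),          # four opcode bits after the addresses
        (flow1, 8),                   # data flow of the first instruction
        (op2, 8),                     # operation code of the second instruction
        (flow2, 8),                   # data flow of the second instruction
    ]
    word = 0
    for value, width in fields:
        word = (word << width) | value
    return word


soft_dual = make_soft_dual(0x9A2BC3, 0xB4675E)
```

The result occupies 48 bits, the same as the two constituent instructions, which reflects the absence of a code size penalty noted above.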

In the example shown, the various stages illustrated in Figure 11 are performed by 
an instruction preprocessor, for example a compiler or an assembler, when preparing 
instructions for execution. The steps performed by the instruction preprocessor are 
represented in a flow diagram shown in Figure 12. 

In step S1, the instruction preprocessor detects the presence of two instructions
which might potentially be combined into a soft dual instruction. In order for this to be
possible, the instructions will need to be such that they may be performed in parallel and
do not result in data or control flow anomalies. Each instruction within the instruction set
is qualified by DAGEN variables in a DAGEN tag, which define the address generator
resources and the type of memory access involved to support the instruction.

Accordingly, in step S2, the instruction preprocessor performs a first step in
determining the feasibility of merging two standalone memory instructions into a soft dual
instruction by analyzing the DAGEN variables. Assuming this checks out, then the
instruction preprocessor is operable to analyze potential bus and operator conflicts and to
establish whether there is a potential bar to the combining of the first and second
instructions.

In step S3, the instruction preprocessor then applies the soft dual instruction tag
737 and modifies the operation codes and address indications, as well as the field positions
as illustrated in Figure 11.

In step S4, the soft dual instruction is output by the instruction preprocessor.

Figure 13 is a schematic block diagram illustrating the decoding process for a soft
dual instruction. Figure 13 illustrates the decoding of a 48 bit instruction word 800 from
the instruction buffer unit 106.

From the operation code (opcode), which is located at the left of the instruction
word as shown in Figure 13, logic 802, 804 in the opcode decoding circuitry is able
rapidly to detect whether a built in dual or soft dual instruction is to be decoded. The
detection of a soft dual tag by tag decoding logic 804 controls a multiplexor 808 to select
either an "E" bit or the soft dual opcodes to be passed from format logic 806 to
instruction #2 alignment and remapping logic 818. Single addressing logic 810 and dual
addressing logic 812 are operable in parallel to commence decoding of the address fields,
which are always located at a determined offset from the left hand end of the instruction.
Outputs of dual decoding logic 802 and soft dual tag field decoding logic 804 are
combined by logic 814 and form a control input to a multiplexor 816. Thus, when a dual
instruction is detected, the output of dual addressing logic 812 is passed to the DAGEN
control, otherwise the output of single addressing logic 810 is passed to DAGEN control.
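
The selection performed by multiplexor 816 can be modelled in a few lines. The following Python sketch is illustrative only: the combining logic 814 is assumed to be a logical OR of the two detection outputs, which the description does not state explicitly, and the function and parameter names are hypothetical.

```python
def select_address_decode(hard_dual_detected, soft_dual_detected,
                          single_output, dual_output):
    """Model multiplexor 816: choose which address decode result reaches DAGEN control.

    Both address decoders are assumed to have already run in parallel;
    only the selection is modelled here.
    """
    # Assumption: logic 814 ORs the outputs of logic 802 and logic 804.
    is_dual = hard_dual_detected or soft_dual_detected
    return dual_output if is_dual else single_output


chosen = select_address_decode(False, True, "single", "dual")
```

Running both decoders speculatively and selecting afterwards is what allows address decoding to start before the instruction type is known.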

As mentioned above, in an alternative form, a compound instruction can comprise
a combined DAGEN code field replacing the separate DAGEN codes of the pair of
instructions forming the compound instruction. A DAGEN tag in the compound
instruction could identify the presence of the combined DAGEN code field, with the
decoder being configured to be responsive to the DAGEN tag to decode the combined
DAGEN code field. The combined DAGEN code field could form part of the combined
address field. The provision of a combined DAGEN field can provide advantages in
execution speed.

If the instruction is a soft dual instruction, then remapping is necessary before
decoding can be performed. Accordingly, instruction field remapping logic 824 is
responsive to the output of the soft dual tag decoding logic 804 to cause the remapping of
the information relating to the first instruction of the pair before passing the remapped
operation information to decode logic 826 for the first instruction. Similarly, instruction
alignment and remapping logic 818 for a second instruction of the instruction pair is
responsive to the output of the soft dual tag decoding logic 804 to cause remapping of the
information relating to the second memory instruction prior to passing the information to
the decode logic 822 for the second instruction. The instruction alignment and field
remapping logic 818 is also operable to realign the second instruction dependent upon the
format of the first instruction, according to the instruction boundary at bit 16, bit 24, bit
32 or bit 40, as appropriate.
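
The realignment of the second instruction can be illustrated as follows. This Python sketch assumes that the 48 bit instruction word holds the first instruction at its left (most significant) end and that realignment means left-justifying the remaining bits; the names used are hypothetical.

```python
WORD_BITS = 48
BOUNDARIES = (16, 24, 32, 40)  # possible instruction boundaries within the word


def realign_second(word, boundary):
    """Split a 48 bit word at `boundary` and left-justify the second instruction.

    Returns (first instruction, realigned remainder as a 48 bit value).
    """
    if boundary not in BOUNDARIES:
        raise ValueError("boundary must be one of 16, 24, 32 or 40")
    if not 0 <= word < (1 << WORD_BITS):
        raise ValueError("expected a 48 bit value")
    first = word >> (WORD_BITS - boundary)
    # Shift the remainder to the left hand end and drop the bits shifted out.
    second_aligned = (word << boundary) & ((1 << WORD_BITS) - 1)
    return first, second_aligned


first, second = realign_second(0x123456789ABC, 24)
```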

With reference to Figures 10 and 13, it can be seen that the decode mechanism
shown in Figure 13 is configured to decode instructions from the instruction buffer. The
decode mechanism is responsive to a predetermined tag in a tag field of a soft dual
instruction as shown in Figure 10 to decode a first memory address for a first memory
instruction and a second memory address for a second memory instruction from a
compound address field in the predetermined soft dual instruction.

Parallel enable bit decoding logic 820 is operable to validate whether the second
instruction may be decoded and executed in parallel with the first instruction. As a soft
dual instruction does not include a parallel enable ("E") bit, this logic 820 is disabled
when a soft dual instruction is detected.

Figure 14 is a schematic block diagram illustrating aspects of the memory bus
interfacing for a soft dual instruction, and Figure 15 is a table summarizing the operand
fetch control for a soft dual instruction.

Figure 14 illustrates the C bus 750, the D bus 752, the E bus 760 and the F bus 
762, which busses were referenced earlier, but were not individually identified. 

A soft dual fetch controller 754 forms part of the instruction control functions of
the processor core 102. This is operable to control operand fetch mechanisms 756 and
782 to fetch X and Y operands 758 and 780 for a first data flow path 790, and X and Y
operands 784 and 786 for a second data flow path 792, respectively, via the C and D
busses 750 and 752. A soft dual write controller 755, which also forms part of the
instruction control functions of the processor core 102, is operable to control memory
write interfaces 794 and 796 to control the writing of operands from the first data flow
path 790 and the second data flow path 792, respectively to the E and F busses 760 and
762.

The table which forms Figure 15 illustrates the operand fetch control operations
performed by the soft dual fetch controller 754. This illustrates the changes to the
operand fetch flow for a soft dual memory instruction compared to a single memory
instruction performed standalone. Thus, when a single memory instruction is executed
standalone, the operand register is loaded from the D bus, whereby the memory request is
a D request, thereby requiring two cycles. However, when a soft dual instruction is
executed, the fetch controller changes the operand fetch flow for the Ymem path, such
that the request is re-directed to a C request and the operand is fetched from the C bus
instead of the D bus as indicated at 1500. Advantageously, operand #1 and operand #2
are fetched in parallel in the same cycle. The same mechanism applies to the write
interface. For example, an E bus request can be redirected to an F bus request.
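
The bus redirection summarized above can be expressed as a small lookup. In the following Python sketch the function names are hypothetical, and the choice of which write path is redirected to the F bus is an assumption, as the description gives the write case only by way of example.

```python
def read_bus(operand, soft_dual):
    """Return the read bus used to fetch operand "X" or "Y".

    Standalone single memory instructions issue a D request; in a soft
    dual instruction the Ymem path is redirected to a C request.
    """
    if operand not in ("X", "Y"):
        raise ValueError("operand must be 'X' or 'Y'")
    if soft_dual and operand == "Y":
        return "C"
    return "D"


def write_bus(path, soft_dual):
    """Return the write bus for data flow path 1 or 2.

    Assumption: the second path is the one redirected from E to F.
    """
    if path not in (1, 2):
        raise ValueError("path must be 1 or 2")
    if soft_dual and path == 2:
        return "F"
    return "E"


busses = (read_bus("X", True), read_bus("Y", True))
```

With distinct busses assigned to the two operands, both fetches can be issued in the same cycle.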

Figure 16 is a schematic representation of an integrated circuit 40 incorporating 
the processor 10 of Figure 1. The integrated circuit can be implemented using application 
specific integrated circuit (ASIC) technology. As shown, the integrated circuit includes a 
plurality of contacts 42 for surface mounting. However, the integrated circuit could 
include other configurations, for example a plurality of pins on a lower surface of the 
circuit for mounting in a zero insertion force socket, or indeed any other suitable 
configuration. 

One application for a processing engine such as the processor 10, for example as 
incorporated in an integrated circuit as in Figure 16, is in a telecommunications device, for 
example a mobile wireless telecommunications device. Figure 17 illustrates one example 
of such a telecommunications device. In the specific example illustrated in Figure 17, the 
telecommunications device is a mobile telephone 11 with an integrated user input device
such as a keypad, or keyboard 12 and a display 14. The display could be implemented
using appropriate technology, as, for example, a liquid crystal display or a TFT display.
The processor 10 is connected to the keypad 12, where appropriate via a keyboard 
adapter (not shown), to the display 14, where appropriate via a display adapter (not 
shown), and to a telecommunications interface or transceiver 16, for example a wireless 
telecommunications interface including radio frequency (RF) circuitry. The radio 
frequency circuitry could be incorporated into, or separate from, an integrated circuit 40 
comprising the processor 10. The RF circuitry 16 is connected to an aerial 18. 

Thus, there has been described a processing engine which provides for execution
of soft encoded dual memory access instructions. The soft dual instruction mechanism
enables execution of two memory access instructions in parallel with high encoding
efficiency. Due to increased parallelism, power consumption can be reduced. Also, a
decoder for a second instruction can be a subset of the decoder for a first instruction
resulting in efficient use of silicon real estate and providing further opportunities for a
reduction in power consumption.

As used herein, the terms "applied," "connected," and "connection" mean
electrically connected, including where additional elements may be in the electrical
connection path.

While the invention has been described with reference to illustrative embodiments, 
this description is not intended to be construed in a limiting sense. Various other 
embodiments of the invention will be apparent to persons skilled in the art upon reference 
to this description. It is therefore contemplated that the appended claims will cover any 
such modifications of the embodiments as fall within the true scope and spirit of the 
invention.

1. A digital system comprising a processing engine, wherein the processing
engine comprises:

an instruction buffer operable to buffer single and compound instructions pending
execution thereof; and

a decode mechanism configured to decode instructions from the instruction buffer; 

the decode mechanism being responsive to a predetermined tag in an instruction, 
the predetermined tag being representative of the instruction being a compound instruction 
formed from separate programmed memory instructions, to decode at least first data flow 
control for a first programmed instruction and at least second data flow control for a 
second programmed instruction. 

2. The processing engine according to claim 1, wherein the compound 
instruction is a compound memory instruction formed by combining separate first and 
second programmed memory instructions. 

3. The processing engine according to claim 2, wherein the decode 
mechanism is operable to decode a first memory address for a first programmed memory 
address instruction and a second memory address for a second programmed memory 
instruction from a compound memory address field in the compound instruction. 

4. The processing engine according to claim 3, wherein the compound 
address field of the compound instruction is at the same bit positions as the address field 
for a hard programmed dual memory instruction. 

5. The processing engine according to claim 4, wherein the memory addresses
in the compound address field of the compound instruction are indirect addresses, the
decode mechanism being operable to decode the indirect addresses.

6. The processing engine according to claim 1, wherein the compound
instruction comprises a split operation code field for a first programmed instruction of the
compound instruction.

7. The processing engine according to claim 6, wherein the decode
mechanism is responsive to the predetermined tag to decode a split operation code for the
first programmed instruction of the compound instruction.

8. The processing engine according to claim 7, wherein the compound
instruction comprises an operation code field for a first programmed instruction of the
compound instruction, which operation code field comprises less bits than the operation
code field of the first programmed instruction.

9. The processing engine according to claim 8, wherein the decode
mechanism is responsive to the predetermined tag to decode a reduced size operation code
for the first programmed instruction of the compound instruction.

10. The processing engine according to claim 9, wherein the compound
instruction has the same number of bits in total as the sum of the bits of the separate
programmed instructions.

11. The processing engine according to claim 1, wherein the compound
instruction has a combined data address generation (DAGEN) field formed from DAGEN
fields of the first and second programmed memory instructions.

12. The processing engine according to claim 11, wherein the combined
DAGEN field forms part of a combined address field.

13. The processing engine according to claim 12, wherein the decode
mechanism is responsive to a predetermined DAGEN tag to decode the combined
DAGEN field.

14. The processing engine according to claim 1, comprising a fetch controller 
operable to fetch in parallel first and second operands from addresses identified by the first 
and second memory addresses, respectively. 

15. The processing engine according to claim 14, comprising a write controller 
operable to write in parallel the result of first and second data flow operations for the first 
and second programmed instructions, respectively. 

16. The processing engine according to claim 1, operable to interpret a
memory access instruction as implicitly capable of parallel execution, whereby a memory
access instruction does not include a parallel enable field.

17. The processing engine according to claim 1, wherein a memory access 
instruction is constrained to be a first programmed instruction of a pair of instructions in 
the instruction buffer. 

18. The digital system of Claim 1 being a cellular telephone, further 
comprising: 

an integrated keyboard connected to the processor via a keyboard adapter; 
a display, connected to the processor via a display adapter; 
radio frequency (RF) circuitry connected to the processor; and 
an aerial connected to the RF circuitry. 

19. The digital system of Claim 1, further comprising an instruction
preprocessing means for preparing instructions for execution, the instruction
preprocessing means being operable to combine separate programmed memory
instructions to form a compound memory instruction.

20. A method of improving the performance of a processing engine, the 
method comprising the steps of: 

buffering a compound instruction formed from separate programmed memory 
instructions, the compound instruction including a tag field containing a predetermined 
compound instruction tag; and 

responding to the predetermined compound instruction tag in the tag field of an 
instruction in the instruction buffer to decode, from the compound instruction, at least first 
data flow control for a first programmed instruction and second data flow control for a 
second programmed instruction. 

21. The method according to claim 20, further comprising the step of 
combining separate first and second programmed memory instructions to form the 
compound instruction. 

22. The method according to claim 20, further comprising the step of decoding 
at least a first memory address for the first programmed memory instruction and a second 
memory address for the second programmed memory instruction from a compound 
address field of the compound instruction. 

23. The method according to claim 22, further comprising the step of decoding 
the compound address field of the compound instruction from the same bit positions as for 
the address field for a hard programmed dual memory instruction. 

24. The method according to claim 20, further comprising the step of decoding
a split operation code for a first instruction of the compound instruction.

25. The method according to claim, further comprising decoding a reduced size
operation code for the first instruction of the compound instruction.

26. The method according to claim 21, wherein the step of responding
comprises decoding a combined data address generation (DAGEN) field formed from
DAGEN fields of the first and second programmed memory instructions.

27. The method according to claim 26, wherein the combined DAGEN field
forms part of a combined address field.

28. The method according to claim 26, wherein the decode mechanism is
responsive to a predetermined DAGEN tag to decode the combined DAGEN field.

29. The method according to claim 22, further comprising the step of fetching
in parallel first and second operands from addresses identified by first and second memory
addresses, respectively.

30. The method according to claim 29, comprising writing in parallel the result
of first and second data flow operations for first and second programmed instructions,
respectively, of the compound instruction.

31. The method according to claim 21, wherein the step of combining 
comprises determining whether the separate programmed memory instructions may be 
combined prior to assembly of the compound instruction. 

32. The method according to claim 31, wherein the step of combining further 
comprises: 

determining programmed memory instructions capable of being combined; and 
combining the determined programmed memory instructions to form a compound 
memory instruction. 

Abstract 



A processing engine 10 includes an instruction buffer 502 operable to buffer single 
and compound instructions pending execution. A decode mechanism is configured to 
decode instructions from the instruction buffer. The decode mechanism is arranged to 
respond to a predetermined tag in a tag field of an instruction, which predetermined tag is 
representative of the instruction being a compound instruction formed from separate 
programmed memory instructions. The decode mechanism is operable in response to the 
predetermined tag 726 to decode at least first data flow control for a first programmed 
instruction 721 and second data flow control 729 for a second programmed instruction 
722. The use of compound instructions enables effective use of the bandwidth available 
within the processing engine. A soft dual memory instruction can be compiled from 
separate first and second programmed memory instructions. A compound address field 
738 of the predetermined compound instruction can be arranged at the same bit positions 
as the address field for a hard compound memory instruction, that is, a compound 
instruction which is programmed. In this case the decoding of the addresses can be started 
before the operation codes of the instructions have been decoded. To reduce the number 
of bits in the compound instruction, addressing can be restricted to indirect addressing and 
the operation codes for at least the first instruction can be reduced in size. In this way, the 
compound instruction can be arranged to have the same number of bits in total as the sum 
of the bits of the separate programmed instructions. 
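
By way of illustration only, the tag-driven decode summarized above might be modelled in software as follows. The field widths, the tag value, the 32-bit word size and all function names are assumptions made for this sketch and are not taken from the specification. The sketch places the compound address field at fixed low-order bit positions so that the addresses can be extracted without first examining the operation codes.

```python
# Illustrative model of a tagged compound instruction word.
# All widths and the tag value below are hypothetical, chosen only for the sketch.
TAG_BITS = 8
OP1_BITS = 4               # reduced-size operation code for the first instruction
OP2_BITS = 8               # operation code for the second instruction
ADDR_BITS = 6              # one indirect address specifier per instruction
COMPOUND_TAG = 0b10011010  # hypothetical predetermined tag

WORD_BITS = TAG_BITS + OP1_BITS + OP2_BITS + 2 * ADDR_BITS  # 32 in this sketch


def _mask(bits):
    """Return a mask with the given number of low-order bits set."""
    return (1 << bits) - 1


def encode_compound(op1, op2, addr1, addr2):
    """Pack two memory instructions into one tagged compound word."""
    fields = ((op1, OP1_BITS), (op2, OP2_BITS),
              (addr1, ADDR_BITS), (addr2, ADDR_BITS))
    for value, bits in fields:
        if not 0 <= value <= _mask(bits):
            raise ValueError("field value does not fit its width")
    word = COMPOUND_TAG
    for value, bits in fields:
        word = (word << bits) | value
    return word


def extract_addresses(word):
    """Read the compound address field from its fixed bit positions.

    This does not look at the operation codes, so it can run before
    (or alongside) operation code decoding.
    """
    addr2 = word & _mask(ADDR_BITS)
    addr1 = (word >> ADDR_BITS) & _mask(ADDR_BITS)
    return addr1, addr2


def decode(word):
    """Return ((op1, addr1), (op2, addr2)) for a compound word, else None."""
    if (word >> (WORD_BITS - TAG_BITS)) != COMPOUND_TAG:
        return None  # not tagged as compound; handled as a single instruction
    addr1, addr2 = extract_addresses(word)
    op2 = (word >> (2 * ADDR_BITS)) & _mask(OP2_BITS)
    op1 = (word >> (2 * ADDR_BITS + OP2_BITS)) & _mask(OP1_BITS)
    return (op1, addr1), (op2, addr2)
```

In this model a word that does not carry the assumed tag is simply reported as not compound, whereas a tagged word yields one operation code and one address for each of the two constituent instructions.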